THERMOSTABLE POLYMERASES HAVING ALTERED FIDELITY

This application claims the benefit of priority
of United States Provisional Application serial No.
60/031,496, filed November 27, 1996, the entire contents
of which are incorporated herein by reference.

This invention was made with government support
under grant number OIG-R35-CA-39903 awarded by the
National Institutes of Health and grant number BIR9214821
awarded by the National Science Foundation. The
government has certain rights in the invention.

BACKGROUND OF THE INVENTION 

The present invention relates generally to
thermostable polymerases and more specifically to methods
for identifying polymerase mutants having desired
fidelity.

Every living organism requires genetic
material, deoxyribonucleic acid (DNA), to pass a unique
collection of characteristics to its offspring. Genes
are discrete segments of the DNA and provide the
information required to generate a new organism. Even
simple organisms, such as bacteria, contain thousands of
genes, and the number is many fold greater in complex
organisms such as humans. Understanding the complexities
of the development and functioning of living organisms
requires knowledge of these genes. However, the amount
of DNA that can be isolated for study has often been
limiting.

A major breakthrough in the study of genes was
the development of the polymerase chain reaction (PCR).
PCR amplifies genes or portions of genes by making many
identical copies, allowing isolation of genes from very
tiny amounts of DNA. The motors for PCR are DNA
polymerases that copy the DNA of each gene during each
round of DNA synthesis. Using oligonucleotides that
determine the start and termination of DNA synthesis, a
single gene can be replicated into millions of copies.
This process has created a revolution in biotechnology
and has been used extensively for the identification of
mutant genes that are responsible for or associated with
inherited human diseases. It is now possible to identify
a mutant gene in a single cell, amplify the gene a
million times, and establish the nature of the mutation.
One application of identifying a mutant gene is the
determination of genetic susceptibility to disease, which
can be mapped by gene amplification and DNA sequencing.



DNA polymerases function in cells as the
enzymes responsible for the synthesis of DNA. They
polymerize deoxyribonucleoside triphosphates in the
presence of a metal activator, such as Mg2+, in an order
dictated by the DNA template or polynucleotide template
that is copied. Even though the template dictates the
order of nucleotide subunits that are linked together in
the newly synthesized DNA, these enzymes also function to
maintain the accuracy of this process. The contribution
of DNA polymerases to the fidelity of DNA synthesis is
mediated by two mechanisms. First, the geometry of the
substrate binding site in DNA polymerases contributes to
the selection of the complementary deoxynucleoside
triphosphates. Mutations within the substrate binding
proof-reading 3'-5' exonuclease that preferentially and
immediately excises non-complementary deoxynucleoside
triphosphates if they are added during the course of
synthesis. As a result, these enzymes copy DNA in vitro
with a fidelity varying from 5 x 10^-4 (1 error per 2000
bases) to 10^-7 (1 error per 10^7 bases) (Fry and Loeb,
Animal Cell DNA Polymerases, pp. 221, CRC Press, Inc.,
Boca Raton, FL (1986); Kunkel, T.A., J. Biol. Chem.
267:18251-18254 (1992)).

In vivo, DNA polymerases participate in a
spectrum of DNA synthetic processes including DNA
replication, DNA repair, recombination, and gene
amplification (Kornberg and Baker, DNA Replication, pp.
929, W.H. Freeman and Co., New York (1992)). During each
DNA synthetic process, the DNA template is copied once or
at most a few times to produce identical replicas. In
vitro DNA replication, in contrast, can be repeated many
times, for example, during PCR.

In the initial studies with PCR, the DNA
polymerase was added at the start of each round of DNA
replication. Subsequently, it was determined that
thermostable DNA polymerases could be obtained from
bacteria that grow at elevated temperatures, and these
enzymes need to be added only once. At the elevated
temperatures used during PCR, these enzymes would not
denature. As a result, one can carry out repetitive
cycles of polymerase chain reactions without adding fresh
enzymes at the start of each synthetic addition process.
The commercial market for the sale of DNA polymerases
from thermostable organisms can be conservatively
estimated at 200 million dollars per year. DNA
the key to a large number of techniques in recombinant
DNA studies and in medical diagnosis of disease. 

Due to the importance of DNA polymerases in 
biotechnology and medicine, it would be highly 
advantageous to generate DNA polymerases having desired 
enzymatic properties such as altered fidelity. However, 
the ability to predict the effect of introducing an amino 
acid mutation into the sequence of a protein remains very 
limited. Even when structural information is available 
for the protein of interest, it is often very difficult 
to predict the effect of mutations of specific amino acid 
residues on the function of that protein. In particular,
it is extremely difficult to predict amino acid 
substitutions that will alter the activity of an enzyme 
to achieve a desirable change. 

Despite the limitations in predicting the 
effect of introducing amino acid substitutions into 
proteins, a number of mutant DNA polymerases have been 
discovered, or have been created by site-specific 
mutagenesis, and have been used in PCR amplification 
(Tabor and Richardson, Proc. Natl. Acad. Sci. USA 
92:6339-6343 (1995)). Some of these mutant polymerases 
offer particular advantages with respect to 
thermostability, processivity, length of the newly
synthesized DNA product, or fidelity of DNA synthesis.
Those that are more accurate for the most part contain a
3'-5' exonuclease activity that removes misincorporated
bases prior to adding the next nucleotide during DNA 
synthesis. However, the current spectrum of mutant DNA 
polymerases is quite limited. For the most part, these 
mutants have been obtained by introducing a single base 
Steitz, Annu. Rev. Biochem. 63:777-822 (1994)). These
laborious and step-wise procedures have been necessary
due to the lack of adequate knowledge to predict the
effects of most single amino acid substitutions and due
to the lack of rules for predicting the effects of
multiple simultaneous substitutions.

Thus, there exists a need for rapid and
efficient methods to produce and screen for modified
polymerases having desired fidelity in polynucleotide
synthesis. The present invention satisfies this need and
provides related advantages as well.

SUMMARY OF THE INVENTION 

The present invention provides a method for
identifying a thermostable polymerase having altered
fidelity. The method consists of generating a random
population of polymerase mutants by mutating at least one
amino acid residue of a thermostable polymerase and
screening the population for one or more active
polymerase mutants by genetic selection. For example,
the invention provides a method for identifying a
thermostable polymerase having altered fidelity by
mutating at least one amino acid residue in an active
site O-helix of a thermostable polymerase. The invention
also provides thermostable polymerases and nucleic acids
encoding thermostable polymerases having altered
fidelity, for example, high fidelity polymerases and low
fidelity polymerases. The invention additionally
provides a method for identifying one or more mutations
in a gene by amplifying the gene with a high fidelity
polymerase. The invention further provides a method for
provides a method for diagnosing a genetic disease using
a high fidelity polymerase mutant. The invention further
provides a method for randomly mutagenizing a gene by
amplifying the gene using a low fidelity polymerase
mutant.



BRIEF DESCRIPTION OF THE DRAWINGS 



Figure 1 shows the nucleotide and amino acid
sequence of Taq DNA polymerase I (SEQ ID NOS: 1 and 2,
respectively).

Figure 2 shows a compilation of amino acid
substitutions identified in a screen of Taq DNA
polymerase I mutants. Panel A shows single mutations,
which were identified in the screen of a 9% library,
listed under the wild type amino acids. Panel B shows
the sequence of multiply substituted mutants identified
in the screen of a 9% library. Panel C shows mutations
selected from a totally random library of selected amino
acids.

Figure 3 shows the spectrum of single base
changes generated in a forward mutation assay by Taq DNA
polymerase I mutant Thr664Arg.



DETAILED DESCRIPTION OF THE INVENTION 



The invention is directed to methods for 
screening and identifying thermostable polymerases that 
have altered fidelity of DNA synthesis as well as to the
resultant polymerase compositions. As disclosed herein, 
the invention provides rapid and efficient methods to 
polymerase mutants having a desired activity such as high
fidelity or low fidelity. An advantage of the methods is 
that they use a population of polymerase mutants to 
rapidly identify active polymerase mutants having altered 
fidelity. The identification of low fidelity mutants is 
useful for introducing mutations into specific genes due 
to the increased frequency of misincorporation of 
nucleotides during error-prone PCR amplification. The 
identification of high fidelity mutants is useful for PCR 
amplification of genes and for mapping of genetic 
mutations. The methods of the invention can therefore be 
advantageously applied to the identification of 
polymerase mutants useful for the characterization of 
specific genes and for the identification and diagnosis 
of human genetic diseases. 

As used herein, the term "polymerase" is 
intended to refer to an enzyme that polymerizes 
nucleoside triphosphates. Polymerases use a template 
nucleic acid strand to synthesize a complementary nucleic 
acid strand. The template strand and synthesized nucleic 
acid strand can independently be either DNA or RNA. 
Polymerases can include, for example, DNA polymerases 
such as Escherichia coli DNA polymerase I and Thermus 
aquaticus (Taq) DNA polymerase I, DNA-dependent RNA 
polymerases and reverse transcriptases. The polymerase 
is a polypeptide or protein containing sufficient amino 
acids to carry out a desired enzymatic function of the 
polymerase. The polymerase need not contain all of the 
amino acids found in the native enzyme but only those 
which are sufficient to allow the polymerase to carry out 
a desired catalytic activity. Catalytic activities 
include, for example, 5'-3' polymerization, 5'-3'

As used herein, the term "polymerase mutant" is
intended to refer to a polymerase that contains one or 
more amino acids that differ from a selected polymerase. 
The selected polymerase is determined based on desired 
enzymatic properties and is used as a parent polymerase 
to generate a population of polymerase mutants. A 
selected polymerase can be, for example, a wild type 
polymerase as isolated from an organism or can be a 
mutant polymerase that differs from a wild type 
polymerase by one or more amino acids and has desirable 
enzymatic properties. As disclosed herein, a 
thermostable polymerase such as Taq DNA polymerase I can 
be selected, for example, as a polymerase to generate a 
population of polymerase mutants. 

As used herein, the term "population" is 
intended to refer to a group of two or more different 
molecular species. Molecular species differ by some 
detectable property such as a difference in at least one 
amino acid residue or at least one nucleotide residue or 
a difference introduced by the modification of an amino 
acid such as the addition of a chemical functional group. 
For example, a population of polymerase mutants would 
contain two or more different polymerase mutants. 
Typically, populations can be as small as two species and 
as large as 10^12 species. In some embodiments,
populations are between about five and 20 different 
species as well as up to hundreds or thousands of 
different species. In other embodiments, populations can 
be, for example, greater than 10^4, 10^5 and 10^6 different
species. In the specific example presented in Example I, 
the population described therein is 50,000 different 
species. In yet other embodiments, populations are 
diversity of a population sufficient for a particular
application.



A population of polymerase mutants consists of
two or more mutant polymerases which differ by at least
one amino acid from the parent polymerase. A population
of polymerase mutants can consist, for example, of
multiple substitutions of a single amino acid residue
where the substitutions are changes to any or all of the
non-parental, naturally occurring amino acids at that
amino acid position. In this example, the population
would comprise nineteen members, and all members of the
polymerase mutant population would consist of nineteen
different amino acid substitutions at a single amino acid
position. A population of polymerase mutants can also
consist, for example, of at least one substitution at two
or more different amino acid positions. In this example,
a minimal population containing two polymerase mutants
would consist of a single amino acid substitution at two
different positions. Such a population can be expanded
with the addition of substitutions to any or all of the
19 non-parental amino acids at these two amino acid
positions or additional amino acid positions.
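
The nineteen-member population described above can be
enumerated in a few lines. The following Python sketch is
illustrative only: the function names, the one-letter
residue codes and the label format (for example "T664R")
are assumptions made for the example, and threonine at
position 664 is taken from the Thr664Arg mutant named in
the description of Figure 3.

```python
# Illustrative sketch: enumerate substitution populations at chosen positions.
# One-letter codes and the "T664R" label format are assumptions for this example.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 naturally occurring amino acids


def single_site_population(parent_residue, position):
    """Return a label for every non-parental substitution at one position."""
    if parent_residue not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid: {parent_residue!r}")
    return [
        f"{parent_residue}{position}{residue}"
        for residue in AMINO_ACIDS
        if residue != parent_residue
    ]


def combinatorial_variant_count(n_positions):
    """Count sequences that differ from the parent at one or more of n positions."""
    return len(AMINO_ACIDS) ** n_positions - 1


population = single_site_population("T", 664)
print(len(population))                 # 19 members at a single position
print(combinatorial_variant_count(2))  # 399 variants across two positions
```

The second function shows how quickly the population
grows once two or more positions are varied together,
which is why a screen over a population is preferred to
testing substitutions one at a time.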

As used herein, the term "random" when used in
reference to a population is intended to refer to a
population of molecules generated without limiting the
molecules to contain predetermined specific residues.
Such a population excludes molecules in which a specific
residue is substituted with a specific predetermined
residue and individually assayed to determine its
activity. The residues can be amino acid residues or
nucleotide residues encoding a codon. The random
encodes an amino acid sequence of a protein region of
interest (see Example I). Thus, a random population is 
generated to contain random oligonucleotide sequences 
which can be expressed in appropriate cells to generate a 
random population of expressed proteins. A specific 
example of such a random population is the population of 
polymerase mutants described in Example I that were 
generated to screen for active polymerase mutants having 
altered fidelity. 



As used herein, the term "catalytic activity"
or "activity" when used in reference to a polymerase is
intended to refer to the enzymatic properties of the
polymerase. The catalytic activity includes, for
example: enzymatic properties such as the rate of
synthesis of nucleic acid polymers; the Km for substrates
such as nucleoside triphosphates and template strand; the 
fidelity of template-directed incorporation of 
nucleotides, where the frequency of incorporation of 
non-complementary nucleotides is compared to that of 
complementary nucleotides; processivity, the number of 
nucleotides synthesized by a polymerase prior to 
dissociation from the DNA template; discrimination of the 
ribose sugar; and stability, for example, at elevated 
temperatures. Polymerases can discriminate between 
templates, for example, DNA polymerases generally use DNA 
templates and RNA polymerases generally use RNA 
templates, whereas reverse transcriptases use both RNA 
and DNA templates. DNA polymerases also discriminate 
between deoxyribonucleoside triphosphates and 
dideoxyribonucleoside triphosphates. Any of these 
distinct enzymatic properties can be included in the 
meaning of the term catalytic activity, including any 
identifying polymerase mutants having altered fidelity
are exemplified herein, the methods of the invention can
similarly be applied to identify polymerases having
altered catalytic activity distinct from altered
fidelity.

As used herein, the term "fidelity" when used 
in reference to a polymerase is intended to refer to the 
accuracy of template-directed incorporation of 
complementary bases in a synthesized DNA strand relative 
to the template strand. Fidelity is measured based on 
the frequency of incorporation of incorrect bases in the 
newly synthesized nucleic acid strand. The incorporation
of incorrect bases can result in point mutations,
insertions or deletions. Fidelity can be calculated
according to the procedures described in Tindall and
Kunkel (Biochemistry 27:6008-6013 (1988)). Methods for
determining fidelity are well known in the art and 
include, for example, those described in Example III. A 
polymerase or polymerase mutant can exhibit either high 
fidelity or low fidelity. As used herein, the term "high 
fidelity" is intended to mean a frequency of accurate 
base incorporation that exceeds a predetermined value. 
Similarly, the term "low fidelity" is intended to mean a 
frequency of accurate base incorporation that is lower 
than a predetermined value. The predetermined value can 
be, for example, a desired frequency of accurate base 
incorporation or the fidelity of a known polymerase. 
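
The relationship between error frequency, accuracy and
the predetermined value can be illustrated with simple
arithmetic. The Python sketch below is not the Tindall
and Kunkel procedure; it only restates the definitions
given above. The function names and the mutant error
counts are hypothetical, and the reference value of 1
error per 2000 bases is the lower bound quoted in the
Background.

```python
# Illustrative arithmetic only; names and mutant numbers are hypothetical.

def error_frequency(errors, bases_copied):
    """Incorrect bases incorporated per base copied."""
    if bases_copied <= 0:
        raise ValueError("bases_copied must be positive")
    return errors / bases_copied


def accuracy(errors, bases_copied):
    """Frequency of accurate base incorporation."""
    return 1.0 - error_frequency(errors, bases_copied)


def classify_fidelity(mutant_accuracy, predetermined_value):
    """Compare a polymerase against a predetermined accuracy value."""
    if mutant_accuracy > predetermined_value:
        return "high fidelity"
    if mutant_accuracy < predetermined_value:
        return "low fidelity"
    return "unchanged"


# Reference: 1 error per 2000 bases, i.e. an error frequency of 5 x 10^-4.
reference = accuracy(1, 2000)

# Hypothetical mutants used only to exercise the comparison.
print(classify_fidelity(accuracy(1, 20000), reference))  # high fidelity
print(classify_fidelity(accuracy(10, 2000), reference))  # low fidelity
```

Here the predetermined value is the accuracy of a known
polymerase, one of the two options named in the
definition; a fixed target accuracy could be passed in
its place.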

As used herein, the term "altered fidelity" 
refers to the fidelity of a polymerase mutant that 
differs from the fidelity of the selected parent 
polymerase from which the polymerase mutant is derived. 
polymerase mutants with altered fidelity can be
classified as high fidelity polymerases or low fidelity 
polymerases. Altered fidelity can be determined by 
assaying the parent and mutant polymerase and comparing 
their activities using any assay that measures the 
accuracy of template directed incorporation of 
complementary bases. Such methods for measuring fidelity 
include, for example, those described in Example III as 
well as other methods known to those skilled in the art. 

As used herein, the term "immutable" when used 
in reference to an amino acid residue is intended to 
refer to an amino acid residue which cannot be 
substituted with another amino acid residue and still 
retain measurable function of the polypeptide. An 
immutable amino acid residue can be determined by 
introducing one or more substitutions of an amino acid 
residue and assaying the resulting mutant polypeptides 
for polypeptide function. An immutable residue can be 
identified, for example, using site-directed mutagenesis 
to substitute each of the 19 non-parental amino acids at 
a given position and determining if any of these mutants 
are active. Random mutagenesis can also be employed to 
introduce substitutions of each of the nineteen, 
naturally occurring non-parental amino acids at a given 
position. Random mutagenesis can provide a statistical 
representation of all 20 amino acids at a given position. 
Sequencing of polymerase mutants allows determination of 
whether a given amino acid residue can tolerate any 
mutations. Assays for determining the function of mutant 
polypeptides include in vitro enzymatic assays as well as 
genetic complementation assays such as those described in 
Example I. If substitution of an amino acid residue with 
function, then that amino acid residue is considered to
be immutable. 

As used herein, the term "nearly immutable" 
when used in reference to an amino acid residue is 
intended to refer to an amino acid residue which can only 
tolerate conservative substitutions and still retain 
polypeptide function. Conservative amino acids are known 
to those skilled in the art and include those amino acids 
which have similar structure and chemical properties. 
Conservative substitutions of amino acids include, for 
example, the identification of amino acid substitutions 
based on the frequencies of amino acid changes between 
corresponding proteins of homologous organisms (Schulz 
and Schirmer, Principles of Protein Structure,
Springer-Verlag, New York (1979)).

As used herein, the term "substantially" or 
"substantially the same" when used in reference to a 
nucleotide or amino acid sequence is intended to mean 
that the function of the polypeptide encoded by the 
nucleotide or amino acid sequence is essentially the same 
as the referenced parental nucleotide or amino acid 
sequence. For example, changes in a nucleotide or amino 
acid sequence that result in substitution of amino acids
that differ from the parent molecule but that do not 
alter the desired activity of the encoded polypeptide 
would result in substantially the same sequence. A 
nucleotide or amino acid sequence is substantially the 
same if the difference in that sequence from the 
reference parental sequence does not result in any 
measurable difference in the desired activity of the 
encoded polypeptide.
The invention provides a method for identifying
a thermostable polymerase having altered fidelity. The
method consists of generating a random population of
polymerase mutants by mutating at least one amino acid
residue of a thermostable polymerase and screening the
population for one or more active polymerase mutants by
genetic selection.



The generation and identification of
polymerases having altered fidelity or altered catalytic
activity is accomplished by first creating a population
of mutant polymerases through random sequence mutagenesis
of regions within the polymerase that can influence the
fidelity of polymerization (Loeb, L.A., Adv. Pharmacol.
35:321-347 (1996)). The identification of active mutants
is performed in vivo and is based on genetic
complementation of conditional polymerase mutants under
non-permissive conditions. Once identified, the active
polymerases are then screened for fidelity of
polynucleotide synthesis.



The methods of the invention employ a
population of polymerase mutants and the screening of the
polymerase mutant population to identify an active
polymerase mutant. Using a population of polymerase
mutants is advantageous in that a number of amino acid
substitutions including single amino acid and multiple
amino acid substitutions can be examined for their effect
on polymerase fidelity. The use of a population of
polymerase mutants increases the probability of
identifying a polymerase mutant having a desired
fidelity.
make predictions about the effect of specific amino acid
substitutions on the activity of the polymerase. The
substitution of single amino acids has limited
predictability as to its effect on enzymatic activity and
the effect of multiple amino acid substitutions is
virtually unpredictable. The methods of the invention
allow for screening a large number of polymerase mutants
which can include single amino acid substitutions and
multiple amino acid substitutions. In addition, using
screening methods that select for active polymerase
mutants has the additional advantage of eliminating
inactive mutants that could complicate screening
procedures that require purification of polymerase
mutants to determine activity.



Moreover, the methods of the invention allow
for targeting of amino acid residues adjacent to
immutable or nearly immutable amino acid residues.
Immutable or nearly immutable amino acid residues are
residues required for activity, and those immutable
residues located in the active site provide critical
residues for polymerase activity. Mutating amino acid
residues adjacent to these required residues provides the
greatest likelihood of modulating the activity of the
polymerase. Introducing random mutations at these sites
increases the probability of identifying a mutant
polymerase having a desired alteration in activity such
as altered fidelity.



A polymerase is selected as a parent polymerase 
to introduce mutations for generating a library of 
mutants. Polymerases obtained from thermophilic organisms
such as Thermus aquaticus have particularly desirable 
are stable and retain activity at temperatures greater
than about 37°C, generally greater than about 50°C, and
particularly greater than about 90°C. The use of the
thermostable polymerase Taq DNA polymerase I as a parent
polymerase to generate polymerase mutants is disclosed
herein (see Example I).



Although a specific embodiment using Taq DNA
polymerase I is disclosed in the examples, the methods of
the invention can similarly be applied to
thermostable polymerases other than Thermus aquaticus DNA
polymerases. Such other polymerases include, for
example, RNA polymerases from Thermus aquaticus and RNA
and DNA polymerases from other thermostable bacteria.
Using the guidance provided herein in reference to DNA
polymerases, those skilled in the art can apply the
teachings of the invention to the generation and
identification of these other polymerases having altered
fidelity of polynucleotide synthesis.



In addition to creating mutant DNA polymerases
from organisms that grow at elevated temperatures, the
methods of the invention can similarly be applied to
non-thermostable polymerases provided that there is a
selection or screen such as the genetic complementation
of a conditional polymerase mutation as described herein
(see Example I). Such a selection or screen of a
non-thermostable polymerase can be, for example, the
inducible or repressible expression of an endogenous
polymerase. Polymerases having altered fidelity can
similarly be generated and selected from both prokaryotic
and eukaryotic cells as well as viruses. Those skilled
in the art will know how to apply the teachings described
fidelity from such other organisms and such other cell
types.



Thus, the invention provides a general method
for the production of a polymerase that has an altered
fidelity in DNA or RNA synthesis. The method consists of
producing a population of sufficient size and diversity
so as to contain at least one polymerase molecule having
an altered fidelity and then screening that population to
identify the polymerase having altered fidelity. The
altered polymerase fidelity can be either an increase or
decrease in the accuracy of DNA synthesis.



In one embodiment, the invention involves the
production of a relatively large population of randomly
mutagenized nucleic acids encoding a polymerase and
introduction of the population into host cells to produce
a library. The mutagenized polymerase encoding nucleic
acids are expressed, and the library is screened for
active polymerase mutants by complementation of a
temperature sensitive mutation of an endogenous
polymerase. Colonies which are viable at the
non-permissive temperature are those which have
polymerase encoding nucleic acids which code for active
mutants.



To generate a random population of polymerase
mutants, a random sequence of nucleotides is substituted
for a defined target sequence of a plasmid-encoded gene
that specifies a biologically active molecule. In one
application of this procedure, a double-stranded
oligodeoxyribonucleotide is provided by hybridizing two
partially complementary oligonucleotides, one or both of
in by DNA polymerase, cut at restriction sites and
ligated into a DNA vector. The plasmid encodes the gene
for a thermostable DNA polymerase, and the
oligonucleotide is inserted in place of a portion of the
gene that modulates the fidelity of DNA synthesis. After
ligation, the reconstructed plasmids constitute a library
of different nucleic acid sequences encoding the
thermostable DNA polymerase and polymerase mutants.



As disclosed herein, a genetic screen can be
used to identify active polymerase mutants having altered
fidelity. The library of nucleic acid sequences encoding
polymerase and polymerase mutants is transfected into a
bacterial strain such as E. coli strain recA718 polA12,
which contains a temperature sensitive mutation in DNA
polymerase. Exogenous DNA polymerases have been shown to
functionally substitute for E. coli DNA polymerase I
using E. coli strain recA718 polA12 and to complement the
observed growth defect at elevated temperature,
presumably caused by the instability of the endogenous
DNA polymerase I at elevated temperatures (Sweasy and
Loeb, J. Biol. Chem. 267:1407-1410 (1992); Kim and Loeb,
Proc. Natl. Acad. Sci. USA 92:684-688 (1995)). It was
unknown, however, whether a thermostable polymerase could
substitute for E. coli DNA polymerase given the distinct
and harsh environment experienced by thermophilic
organisms in which enzymes must function at extremely
high temperatures. As disclosed herein, wild type Taq
DNA polymerase I was found to complement the growth
defect of E. coli strain recA718 polA12 (see Example I).
Using such a complementation system, various Taq
DNA polymerase I mutants were identified in host bacteria
that harbor plasmids encoding active thermoresistant DNA
formation at elevated (restrictive) temperatures (see
Examples I and II). 

The invention also provides a method for 
identifying a thermostable polymerase having altered 
fidelity. The method consists of generating a random 
population of polymerase mutants by mutating at least one 
amino acid residue in an active site O-helix of a 
thermostable polymerase and screening the population for 
one or more active polymerase mutants. 

The invention additionally provides a method 
for identifying a thermostable polymerase having altered 
catalytic activity. The method consists of generating a 
random population of polymerase mutants by mutating at 
least one amino acid residue of a thermostable polymerase 
and screening the population for one or more active 
polymerase mutants. 

A random population of polymerase mutants is 
generated by mutating one or more amino acid residues in 
an active site O-helix target sequence of a thermostable 
polymerase. The O-helix has been postulated to interact 
with the substrate template complex (Joyce and Steitz, 
supra, (1994)). The O-helix has been observed in the
crystal structure of E. coli DNA polymerase I Klenow
fragment and Taq DNA polymerase (Beese et al., Science
260:352-355 (1993); Kim et al., Nature 376:612-616 
(1995)). As disclosed in Example II, random sequences 
were substituted for nucleotides encoding amino acids 
Arg659 through Tyr671 of the O-helix of Taq DNA 
polymerase I to generate a random population of 
polymerase mutants.
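
The size of the sequence space opened up by randomizing
the Arg659 through Tyr671 target can be estimated with a
short calculation. The Python sketch below is illustrative
only: it assumes every nucleotide in the target is fully
randomized with equal probability, which need not match
the libraries actually constructed (the 9% library named
in the description of Figure 2, for example), and the
function and variable names are hypothetical.

```python
import random

# Target region taken from the text: Arg659 through Tyr671 of the O-helix.
FIRST_RESIDUE, LAST_RESIDUE = 659, 671

n_residues = LAST_RESIDUE - FIRST_RESIDUE + 1  # residues in the target
n_nucleotides = 3 * n_residues                 # one codon per residue

# Upper bounds on diversity, assuming unrestricted randomization (an assumption).
protein_space = 20 ** n_residues
nucleotide_space = 4 ** n_nucleotides


def random_target_sequence(length, rng):
    """Draw one fully random nucleotide sequence of the given length."""
    return "".join(rng.choice("ACGT") for _ in range(length))


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
library = [random_target_sequence(n_nucleotides, rng) for _ in range(5)]

print(n_residues, n_nucleotides)  # 13 residues, 39 nucleotides
print(f"{protein_space:.2e} possible amino acid sequences")
print(f"{nucleotide_space:.2e} possible nucleotide sequences")
```

Because the possible sequences far outnumber any library
that can be screened, a selection step that discards
inactive mutants is what makes such a search practical.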
Using a genetic complementation screen, a
variety of active Taq DNA polymerase I mutants were 
identified (see Example II). Several amino acid residues 
were found to be immutable or nearly immutable based on 
the complementation assay. These immutable or nearly 
immutable amino acid residues in the O-helix are Arg659, 
Lys663, Phe667 and Tyr671. As used herein, a wild type 
amino acid is designated as a residue preceding the 
number of the amino acid position. A mutated amino acid 
is designated as a residue following the number of the 
amino acid position. These immutable or nearly immutable
sites cannot be substantially altered while still
maintaining the function of the DNA polymerase. Because of
their position in the active site O-helix of Taq DNA
polymerase I, these residues are critical for the activity
of the polymerase.
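
The designation convention described above (wild type residue, position number, mutated residue) can be handled mechanically. The following is a minimal sketch in Python offered for illustration only; the function name `parse_substitution` and the regular expression are assumptions and are not part of the disclosure.

```python
import re

# Three-letter residue, position number, three-letter residue,
# for example "Asn666Asp". The pattern is an illustrative assumption.
_SUBSTITUTION = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_substitution(name):
    """Split a designation into (wild_type, position, mutant)."""
    match = _SUBSTITUTION.match(name)
    if match is None:
        raise ValueError(f"not a substitution designation: {name!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


# The residue before the number is wild type; the residue after it is mutated.
print(parse_substitution("Asn666Asp"))
```

A designation lacking a trailing residue, such as Arg659, names only a wild type position and is rejected by this sketch.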

In addition to the O-helix of a polymerase, 
other regions of the polymerase can be targeted for 
random mutagenesis to generate a library of polymerase 
mutants to identify polymerase mutants having altered 
fidelity. Those skilled in the art can determine other 
regions to target for mutagenesis. Such other regions 
can be identified, for example, by sequence homology to 
other polymerases, which suggests conservation of 
function. Conserved sequences can also be used to 
identify target regions for mutagenesis based on activity 
studies of other polymerases. Protein structural models 
revealing the convergence of amino acid residues at the 
active site of a polymerase can similarly be used to 
identify target regions for mutagenesis. 
critical for polymerase function. Sequences containing
these critical amino acid residues are target sequences
for introducing random mutations to identify mutants
having altered fidelity. Methods for identifying
critical amino acid residues by introducing a small
number of random mutations throughout a gene segment are
well known to those skilled in the art and include, for
example, copying by mutagenic polymerases, exposure of
templates to DNA damaging agents prior to inserting into
cells and replacement of regions of the DNA template with
oligonucleotides containing sparsely populated random
inserts. For example, a population of oligonucleotides
with [illegible]
non-complementary nucleotides at each position can be
generated. Screening for polymerase mutants can be
performed, for example, with the genetic complementation
assay disclosed herein.

The invention also provides a method for
identifying a thermostable polymerase having altered
fidelity. The method consists of generating a random
population of polymerase mutants by mutating one or more
amino acid residues adjacent to an immutable or nearly
immutable residue in an active site O-helix of a
thermostable polymerase and screening the population for
one or more active polymerase mutants.

In one embodiment, substitutions at amino acids 
adjacent to immutable or nearly immutable residues are 
used to identify polymerase mutants having altered 
fidelity. The adjacent amino acid residues can be 
immediately adjacent in the linear sequence or can be
nearby. Adjacent residues that are nearby can be as many
residue can also be nearby in the three-dimensional
structure of the polymerase and can be determined from a
crystallographic molecular model of a polymerase. Nearby
residues are in close enough proximity to an immutable or
nearly immutable residue to modulate the activity of the
polymerase. Generally, nearby residues are within two
amino acid residues in the linear sequence from an
immutable or nearly immutable residue or are within about
5 Å of the immutable or nearly immutable residues, in
particular within about 3 Å.

Substitutions involving amino acid residues
adjacent to immutable or nearly immutable sites have been
found to alter the fidelity of DNA synthesis (see
Examples IV and V). The identified immutable or nearly
immutable amino acid residues correspond to amino acid 
residues Arg659, Lys663, Phe667 and Tyr671 of Taq DNA 
polymerase I. Thus, the invention is directed to 
altering one or more amino acid residues adjacent to an 
amino acid residue corresponding to Arg659, Lys663, 
Phe667 or Tyr671 in Taq DNA polymerase. Amino acid 
residues adjacent to these immutable residues include, 
for example, amino acids corresponding to Arg660, Ala661, 
Ala662, Thr664, Ile665, Asn666, Gly668, Val669 and Leu670 
in Taq DNA polymerase I. Corresponding residues in other 
polymerases are also included and can be identified based 
on sequence homology or based on corresponding amino 
acids in structurally similar domains as defined by a 
crystallographic molecular model. 
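
The list of adjacent residues given above follows from the linear-sequence criterion of being within two amino acid residues of an immutable or nearly immutable site. The following Python sketch reproduces that list under this criterion only; the function name and the `window` parameter are illustrative assumptions, and proximity in the three-dimensional structure is not modeled.

```python
IMMUTABLE = (659, 663, 667, 671)  # Arg659, Lys663, Phe667, Tyr671
O_HELIX = range(659, 672)         # Arg659 through Tyr671, inclusive


def adjacent_positions(immutable=IMMUTABLE, helix=O_HELIX, window=2):
    """Return O-helix positions within `window` residues of an immutable site."""
    fixed = set(immutable)
    return sorted(
        pos
        for pos in helix
        if pos not in fixed and any(abs(pos - f) <= window for f in fixed)
    )


print(adjacent_positions())
```

With the values shown, the function returns positions 660 through 662, 664 through 666 and 668 through 670, which correspond to the nine adjacent residues named in the text.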

The methods of the invention are also directed
to altering residues immediately adjacent to the
immutable or nearly immutable residues. Thus, the
and identifying those mutations which have an effect on
the fidelity of DNA synthesis.

The invention further provides methods for
determining the fidelity of an active polymerase mutant.
The fidelity of active polymerase mutants can be
determined by several methods. The active polymerases
can be, for example, screened for altered fidelity from
crude extracts of bacterial cells grown from the viable
colonies. Methods for determining fidelity of synthesis
are disclosed herein (see Example III). In one method, a
primer extension assay is used with a biased ratio of 
nucleoside triphosphates consisting of only three of the 
nucleoside triphosphates. Elongation of the primer past 
template positions that are complementary to the deleted 
nucleoside triphosphate substrate in the reaction mixture 
results from errors in DNA synthesis. Elongation by
high fidelity polymerases will terminate when they
encounter a template nucleotide complementary to the
missing nucleoside triphosphate whereas the low fidelity
polymerases will be more likely to misincorporate a
non-complementary nucleotide. The accuracy of incorporation
for the primer extension assay can be measured by 
physical criteria such as by determining the size or the 
sequence of the extension product. This method is 
particularly suitable for screening for low fidelity 
mutants since increases in chain elongation are easily 
and rapidly quantitated. 
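
The logic of the primer extension assay with one nucleoside triphosphate omitted can be sketched as follows. This Python example is illustrative only: it assumes an idealized error-free polymerase and a template written in the order in which it is copied, and the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def error_free_extension(template, missing_dntp):
    """Count nucleotides added before the missing dNTP is first required.

    `template` lists the template bases downstream of the primer in the
    order the polymerase copies them. An error-free polymerase stops at
    the first template base that pairs with the missing dNTP.
    """
    required = COMPLEMENT[missing_dntp]
    for added, base in enumerate(template):
        if base == required:
            return added
    return len(template)


# With dATP omitted, extension should stop at the first template T.
print(error_free_extension("GCTAGC", "A"))
```

An extension product longer than the value returned indicates misincorporation opposite the template base that requires the missing nucleoside triphosphate.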

A second method for determining the fidelity of 
polymerase mutants employs a forward mutation assay. A 
template containing a single stranded gap in a reporter 
gene such as lacZ is used for the forward mutation assay. 
expressing a thermostable DNA polymerase mutant. For
determining low fidelity polymerase mutants, reactions
are carried out in the presence of equimolar
concentrations of each nucleoside triphosphate. For
determining high fidelity polymerase mutants, the
reaction is carried out with a biased pool of nucleoside
triphosphates. Using a biased pool of nucleoside
triphosphates results in incorporation of errors in the
synthesized strand that are proportional to the ratio of
non-complementary to complementary nucleoside
triphosphates in the reaction. Therefore, the bias
exaggerates the errors produced by the polymerases and
facilitates the identification of high fidelity mutants.
The fidelity of DNA synthesis is determined from the
number of mutations produced in the reporter gene.

Procedures other than those described above for 
identifying and characterizing the fidelity of a 
polymerase are known in the art and can be substituted 
for identifying high or low fidelity mutants. Those
skilled in the art can determine which procedures are
appropriate depending on the needs of a particular
application.

Also provided herein is an isolated
thermostable polymerase mutant having altered fidelity.
The polymerase mutant has one or more mutated amino acid
residues in the active site O-helix of a thermostable
polymerase. Additionally provided is an isolated
thermostable polymerase mutant having altered fidelity.
The polymerase mutant has one or more mutated amino acid
residues adjacent to an immutable or nearly immutable
amino acid residue in the active site O-helix of a
is adjacent to an amino acid residue corresponding to
Arg659, Lys663, Phe667 or Tyr671 in Taq DNA polymerase.

The invention also provides an isolated 
thermostable polymerase mutant having altered fidelity, 
where the polymerase has one or more mutated amino acid
residues adjacent to an amino acid residue corresponding 
to Arg659, Lys663, Phe667 or Tyr671 in Taq DNA polymerase 
and the mutant is a high fidelity mutant. 

Using the methods of the invention, a number of
mutants have been identified as having high fidelity of
DNA synthesis. For example, polymerases having one or
more single-base substitutions adjacent to Arg659,
Lys663, Phe667, and Tyr671 in the nucleotide sequence of
Taq DNA polymerase I have been identified. Specific
examples of these high fidelity mutants include, for
example, polymerases having the single substitutions
Asn666Asp, Asn666Ile, Ile665Leu, Leu670Val, Arg660Tyr,
Arg660Ser, Gly668Arg, Arg660Lys, Gly668Ser and Gly668Gln;
polymerases having the double substitutions consisting of
Thr664Ile together with Asn666Asp, and Ala661Ser together
with Val669Leu; as well as polymerases having the triple
substitutions consisting of Thr664Pro, Ile665Val together
with Asn666Tyr, and Ala661Glu, Ile665Thr together with
Phe667Leu. Additional high fidelity mutants include, for
example, Phe667Leu and Phe667Tyr.

The invention provides a high fidelity 
polymerase mutant having one or more amino acid 
substitutions selected from the group consisting of 
Phe667Leu; Asn666Asp; Asn666Ile; Ile665Leu; Leu670Val;
Arg660Tyr; Arg660Ser; Gly668Arg; Arg660Lys; Gly668Ser;
Thr664Pro, Ile665Val and Asn666Tyr. The polymerase
mutant Phe667Tyr has been previously described and is 
excluded from the compositions of the invention. 

The invention also provides an isolated 
thermostable polymerase mutant having altered fidelity, 
where the polymerase has one or more mutated amino acid 
residues adjacent to an amino acid residue corresponding 
to Arg659, Lys663, Phe667 or Tyr671 in Taq DNA polymerase 
and the mutant is a low fidelity mutant. The invention 
additionally provides a low fidelity polymerase mutant 
having one or more amino acid substitutions selected from 
the group consisting of Ala661Glu; Ala661Pro; Thr664Pro;
Thr664Asn; Thr664Arg; Asn666Val; Thr664Pro and Val669Ile; 
Arg660Pro and Leu670Thr; Arg660Trp and Thr664Lys; 
Ala662Gly and Thr664Asn; Ala661Gly and Asn666Ile; 
Ala661Pro and Asn666Ile; and Ala661Ser, Ala662Gly, 
Thr664Ser and Asn666Ile. 

Low fidelity mutant DNA polymerases include 
mutations involving substitutions at Ala661, Thr664, 
Asn666, and Leu670. Specific examples of low fidelity 
mutants include, for example, polymerases having the 
single substitutions Ala661Glu, Ala661Pro, Thr664Pro, 
Thr664Asn, Thr664Arg and Asn666Val; polymerases having 
the double substitutions consisting of Thr664Pro together 
with Val669Ile, Arg660Pro together with Leu670Thr,
Arg660Trp together with Thr664Lys, Ala662Gly together
with Thr664Asn, Ala661Gly together with Asn666Ile, and
Ala661Pro together with Asn666Ile; as well as polymerases
having four substitutions consisting of Ala661Ser,
Ala662Gly, Thr664Ser together with Asn666Ile.

polymerases other than Taq DNA polymerase having
mutations at corresponding positions. In particular, the 
invention provides thermostable polymerases other than 
Taq DNA polymerase that have mutations at corresponding 
positions and that have altered fidelity. Those skilled
in the art can determine corresponding positions based on 
sequence homology between the polymerases. 

The invention also provides an isolated nucleic
acid molecule encoding a polymerase mutant having high
fidelity. The nucleic acid molecule contains a
nucleotide sequence encoding substantially an amino acid
sequence of Taq DNA polymerase I having one or more amino
acid substitutions selected from the group consisting of
Phe667Leu; Asn666Asp; Asn666Ile; Ile665Leu; Leu670Val;
Arg660Tyr; Phe667Tyr; Arg660Ser; Gly668Arg; Arg660Lys;
Gly668Ser; Gly668Gln; Thr664Ile and Asn666Asp; Ala661Ser
and Val669Leu; Ala661Glu, Ile665Thr, and Phe667Leu; and
Thr664Pro, Ile665Val and Asn666Tyr.

Additionally provided is an isolated nucleic
acid molecule encoding a polymerase mutant having low
fidelity. The nucleic acid molecule contains a
nucleotide sequence encoding substantially an amino acid
sequence of Taq DNA polymerase I having a substitution of
one or more amino acids selected from the group
consisting of Ala661, Thr664, Asn666 and Leu670. The
invention also provides a polymerase mutant having one or
more amino acid substitutions selected from the group
consisting of Ala661Glu; Ala661Pro; Thr664Pro; Thr664Asn;
Thr664Arg; Asn666Val; Thr664Pro and Val669Ile; Arg660Pro
and Leu670Thr; Arg660Trp and Thr664Lys; Ala662Gly and
Thr664Asn; Ala661Gly and Asn666Ile; Ala661Pro and

The invention also provides methods for the
identification of one or more mutations in a gene using 
the high fidelity mutant DNA polymerases of the 
invention. For example, the use of a high fidelity 
mutant to amplify a gene of interest gives greater 
confidence that the amplified sequence will more 
accurately reflect the actual sequence in the sample and 
minimizes the introduction of artifactual mutations 
during amplification of the gene. The higher accuracy of 
gene amplification provided by a high fidelity mutant 
also improves the identification of genetic mutations due 
to the increased confidence that observed mutations are 
more likely to reflect genetic mutations in the sample
rather than artifactual mutations introduced during
amplification.

Additionally, the invention provides methods 
for identifying one or more mutations in a gene by 
amplifying the gene using a high fidelity polymerase 
mutant under conditions which allow polymerase chain 
reaction amplification. The gene is amplified by 
exposing the strands of the gene to repeated cycles of 
denaturing, annealing and elongation to produce an 
amplified gene product. Methods for amplifying genes 
using PCR are well known to those skilled in the art and 
include those described previously in PCR Primer: A
Laboratory Manual, Dieffenbach and Dveksler, eds., Cold
Spring Harbor Press, Plainview, New York (1995). The
presence or absence of one or more mutations in the gene 
can be determined by sequencing the amplified product 
using methods well known to those skilled in the art. 



The invention provides methods for accurately 
polymerase mutant. The repetitive nucleotide sequence
can be in a gene or in a microsatellite between genes.
The methods of amplifying the repetitive nucleotide 
sequences are carried out under conditions which allow 
PCR amplification with repeated cycles of denaturing, 
annealing and elongation as described above. 

The high fidelity mutants of the invention are 
advantageous for copying repetitive nucleotide sequences 
such as repetitive DNA because polymerases found in 
nature undergo slippage when copying DNA containing 
repetitive sequences. Therefore when polymerases found 
in nature are used, the amplification products of a 
nucleotide sequence containing a repetitive sequence do 
not accurately reflect the size or sequence of a DNA 
sequence in a sample. However, the use of a high 
fidelity polymerase mutant greatly increases the accuracy 
of an amplification product to reflect the actual size 
and sequence of the repetitive DNA sequence in the 
sample. Repetitive DNA can be found in microsatellites, 
which contain multiple repetitive nucleotide sequences 
and are dispersed throughout the genome. These 
repetitive di-, tri- and tetranucleotides are frequently,
but not invariably, located between genes. 

The invention also provides a method for 
determining an inherited mutation by amplifying a gene 
using a high fidelity polymerase mutant. Such an 
inherited mutation can be correlated with a genetic 
disease, thereby allowing diagnosis of the genetic 
disease. The invention additionally provides methods for 
diagnosing a genetic disease by amplifying a gene using a 
high fidelity polymerase mutant. A genetic disease is 
mutation can be a somatic mutation or a germline
mutation. The methods of the invention can be used to 
diagnose any genetic disease using high fidelity 
polymerase mutants. Such genetic diseases can involve 
point mutations, insertions and deletions. 

The methods of the invention employ high 
fidelity polymerase mutants and can similarly be used to 
diagnose genetic diseases involving repetitive DNA. In 
one embodiment, the genetic disease involves mutations in 
a microsatellite or repetitive DNA. Microsatellites are 
relatively stable in normal cells but are found to be 
unstable and to vary in length in some forms of 
hereditary and non-hereditary cancer, including 
hereditary nonpolyposis colorectal cancer (HNPCC), other
cancers that arise in HNPCC families, Muir-Torre syndrome 
and small-cell lung cancer (Loeb, Cancer Res. 54:5059- 
5063 (1994); Brentnall, Am. J. Pathol. 147:561-563 
(1995); Honchel et al., Semin. Cell Biol. 6:45-52 (1995); 
Eshleman and Markowitz, Curr. Opin. Oncol. 7:83-89 
(1995)). Microsatellite instability appears to be 
confined to tumors and is not present in normal tissues 
of affected individuals. 

The accuracy of amplification products of 
repetitive DNA sequences provided by the high fidelity 
mutants of the invention can be used to diagnose diseases 
involving mutations in repetitive DNA sequences. For 
example, with tumor samples, the accurate amplification 
of repetitive DNA sequences can be used to diagnose those 
cancers involving variable length in microsatellite DNA. 
Since microsatellite instability appears to be confined 
to tumors, amplification of repetitive DNA using the high 
of a cancer patient, evaluating outcomes of therapy,
staging tumors and determining tumor status. High
fidelity mutants of the invention can also be applied to
amplify DNA in blood samples to identify circulating
cells containing microsatellite instability as an
indicator of a cancerous state.

Other genetic diseases also involve repetitive 
DNA sequences, in particular, unstable triplet repeats. 
These unstable triplet repeat diseases involve increasing 
lengths of triplet repeat regions, ranging from ~50
repeats in normal individuals, ~200 repeats in carriers
to ~2000 repeats in affected individuals. Such unstable
triplet repeat diseases include, for example, fragile X
syndrome, spinal and bulbar muscular atrophy, myotonic
dystrophy, Huntington's disease, spinocerebellar ataxia
type 1, fragile XE mild mental retardation and
dentatorubral pallidoluysian atrophy (Monckton and 
Caskey, Circulation 91:513-520 (1995)). The diagnosis of 
unstable triplet repeat diseases is particularly valuable 
since the onset of symptoms can occur later in some 
diseases and the severity of the symptoms of some 
diseases can be correlated with the size of the extended 
triplet repeat region. Thus, amplification of these 
triplet repeat regions to more accurately reflect the 
actual size of the triplet repeat in the individual 
provides more accurate diagnosis and prognosis of the 
disease. Amplification of the large expanded regions 
associated with triplet repeat diseases can be carried 
out using low fidelity polymerase mutants of the 
invention since low fidelity polymerase mutants would be 
more likely to copy through very long stretches of 
repetitive nucleotide sequences. 
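
Once an amplified triplet repeat region has been sequenced, its size in repeat units can be counted directly. The following Python sketch is illustrative only; the function name and the example sequence are assumptions and do not correspond to any particular disease locus.

```python
def longest_repeat_run(sequence, unit):
    """Return the largest number of consecutive copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    step = len(unit)
    for start in range(len(sequence)):
        count = 0
        pos = start
        # Walk forward one repeat unit at a time while the unit keeps matching.
        while sequence[pos:pos + step] == unit:
            count += 1
            pos += step
        best = max(best, count)
    return best


# Hypothetical sequence containing three consecutive CAG triplets.
print(longest_repeat_run("TTCAGCAGCAGGA", "CAG"))
```

The count is only as reliable as the amplification product from which the sequence was obtained.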

One method for identifying a genetic disease
involves utilization of primers that hybridize to
specific genes. The primers contain 3'-terminal
nucleotides complementary to the corresponding nucleotide 
in the mutant but not to the wild type gene. The 
mismatched primer is used to extend the primer template 
in the presence of a high fidelity mutant polymerase. 
The presence of an extension product is indicative of a 
mutant gene. 

The mismatch PCR method is based on the fact 
that a PCR primer that is not complementary to the 
template at the 3' end is an inefficient substrate for
polymerases such as Taq DNA polymerase I. Wild type Taq 
DNA polymerase will occasionally misextend a mismatched 
primer, resulting in a false positive in an assay for a 
gene mutation. For example, a mutant gene with a rare TT 
mutation would be difficult to specifically amplify out 
of a pool of DNA molecules containing a wild type CC at 
the position of the TT mutant because wild type Taq DNA 
polymerase would occasionally misextend the wild type 
gene using the mismatched primer. In contrast, a high 
fidelity polymerase would not extend the mismatched 
primer. The products of a high fidelity polymerase in 
the mismatch PCR assay would therefore correspond to the 
mutant gene and would have fewer false positives than 
that observed with wild type Taq DNA polymerase. Thus, 
the more discriminating assay based on the use of high 
fidelity polymerases results in a better assay for 
detecting somatic mutations. The use of high fidelity 
mutants in such a mismatch-PCR based assay is disclosed 
herein (see Example V). 
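
The primer design underlying the mismatch PCR method can be illustrated with a short Python sketch. The sequences below are hypothetical and chosen only to mirror the TT mutant and CC wild type example above; the function name and the convention of writing the template site 3' to 5' are assumptions made for this illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def terminal_mismatches(primer, template_site, k=2):
    """Count mismatches among the last k nucleotides at the primer 3' end.

    `primer` is written 5' to 3'. `template_site` is the annealing site
    written 3' to 5', so primer[i] pairs with template_site[i].
    """
    if len(primer) != len(template_site):
        raise ValueError("primer and template site must be the same length")
    pairs = zip(primer[-k:], template_site[-k:])
    return sum(1 for p, t in pairs if COMPLEMENT[p] != t)


primer = "GATCAA"        # 3'-terminal AA pairs with a TT mutation
mutant_site = "CTAGTT"   # hypothetical mutant template, 3' to 5'
wild_site = "CTAGCC"     # hypothetical wild type template, 3' to 5'
print(terminal_mismatches(primer, mutant_site))
print(terminal_mismatches(primer, wild_site))
```

A primer with no 3'-terminal mismatches is expected to be extended, whereas extension of a primer with terminal mismatches depends on the tendency of the polymerase to misextend.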
the low fidelity polymerase mutants of the invention.
The low fidelity polymerase mutants exhibit an efficiency 
of accurate base incorporation that is less than that of 
wild type polymerases. The efficiency of the low 
fidelity polymerase mutant is about 50% or more, 
generally 10% or more, and particularly 1% or more than 
that of a wild type polymerase. These low fidelity 
polymerase mutants would therefore exhibit from 2-fold
to 100-fold lower fidelity than wild type polymerase.
The introduction of mutations into specific genes using
low fidelity polymerase mutants of the invention is
useful for determining the effects of mutations on the
function of those gene products.
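
The relation between the efficiency of accurate base incorporation relative to wild type and the fold reduction in fidelity stated above is a simple reciprocal. The following Python sketch, with a hypothetical function name, checks the three values given in the text using exact arithmetic.

```python
from fractions import Fraction


def fold_reduction(percent_of_wild_type):
    """Convert an efficiency given as a whole-number percentage of wild type
    into a fold reduction, using exact rational arithmetic."""
    return 1 / Fraction(percent_of_wild_type, 100)


for percent in (50, 10, 1):
    print(percent, fold_reduction(percent))
```

Efficiencies of 50%, 10% and 1% of wild type therefore correspond to 2-fold, 10-fold and 100-fold lower fidelity, respectively.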

It is understood that modifications which do 
not substantially affect the activity of the various 
embodiments of this invention are also included within 
the definition of the invention provided herein. 
Accordingly, the following examples are intended to 
illustrate but not limit the present invention. 

EXAMPLE I 

Random Sequence Mutagenesis and Identification of Active
Taq DNA Polymerase Mutants

This example demonstrates random nucleotide 
sequence mutagenesis of a polymerase target sequence and 
identification of active polymerase mutants. 

Random sequence mutagenesis was used to
introduce mutations into the O-helix of Taq DNA
polymerase. Briefly, the Taq DNA polymerase I gene was
obtained from the bacterial chromosome by cloning in
fragment containing the Taq DNA polymerase I gene,
including the 5'-3' exonuclease domain and the tac
promoter region, was further transferred into the SalI
site of pHSG576 (pTacTaq). The Taq DNA polymerase I gene
was sequenced to confirm wild type sequence except for
the lack of the N-terminal three amino acids.



A vector containing a nonfunctional insert
within the Taq DNA polymerase I gene was constructed, and
the insert was subsequently replaced with an
oligonucleotide containing the random sequence, to avoid
contamination with incompletely cut vectors. To generate
the nonfunctional vector, a SacII site was produced using
site-directed mutagenesis by changing 2070C to G using a
synthetic oligomer, 5'-GGG TCC ACG GCC TCC CGC GGG ACG CCG
AAC ATC CAG CTG (SEQ ID NO:3) (SacII-2) and the
single-stranded plasmid pFC85 (Kunkel, Proc. Natl. Acad.
Sci. USA 82:488-492 (1985)). The BstXI-NheI fragment that
carries the SacII site was substituted for the
corresponding fragment in pTacTaq (pTacTaqSac). A
SacII-NheI fragment in pTacTaqSac was further replaced
with the synthetic oligomer 5'-GGA CTG CAT ATG ACT G
(SEQ ID NO:4) (DUM-U) hybridized with 5'-CTA GCA GTC ATA
TGC AGT CCG C (SEQ ID NO:5) (DUM-D) to create the
nonfunctional vector (Dube et al., Biochemistry
30:11760-11767 (1991)).

Oligonucleotides containing 9% random sequence,
in which each nucleotide indicated in parentheses was 91%
wild type nucleotide and 3% each of the other three
nucleotides, were synthesized by Keystone Laboratories
(Menlo Park, CA): O(+) RANDOM is 5'-CGG GAG GCC GTG GAC
CCC CTG ATG (CGC CGG GCG GCC AAG ACC ATC AAC TTC GGG GTC
CTC TAC) GGC ATG TCG GCC CAC CG (SEQ ID NO:6); O(-) RANDOM
ends of the two oligonucleotides are complementary.
Equimolar amounts of these oligonucleotides (20 pmol)
were mixed, hybridized, and extended by five cycles of
PCR reaction (94°C for 30 sec, 57°C for 30 sec, and 72°C
for 30 sec) in a 100 µl reaction mixture containing 10 mM
Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.001%
gelatin, 50 µM dNTPs, and 2.5 units of Taq DNA polymerase
I. This PCR product (10 µl) was further amplified for 25
cycles with 20 pmol of O(+) PRIMER (5'-TTC GGC GTC CCG CGG
GAG GCC GTG GAC CCC CT) (SEQ ID NO:8) and 20 pmol of
O(-) PRIMER (5'-GTA AGG GAT GGC TAG CTC CTG
GGA) (SEQ ID NO:9) under the same conditions. The
amplified product was purified by phenol/chloroform
extraction followed by ethanol precipitation and
digestion with the restriction enzymes, SacII and NheI,
at 37°C for 30 min in 50 mM Tris-HCl (pH 7.9), 50 mM
NaCl, 10 mM MgCl2 and 1 mM dithiothreitol. The
restriction fragment containing the random sequence was
purified by phenol/chloroform extraction, ethanol
precipitation, and filtration using a Microcon 30 filter
(Amicon, Beverly, MA). For the totally random library,
five oligonucleotides (80-mers), each having totally
random sequence at one of the codons 659, 660, 663, 667
or 668, were combined in equal amounts and hybridized to
O(-) RANDOM. After extension and digestion with
endonucleases, the combined products were purified and
processed as above.
A random library of Taq DNA polymerase genes
containing randomized nucleotide sequence corresponding
to the O-helix was generated by digesting the vector
containing the nonfunctional insert with NheI and SacII
restriction endonucleases. The large DNA fragment was
large fragment, lacking the nonfunctional insert, was
ligated with an oligonucleotide containing randomized
sequence by incubating overnight at 16°C with T4 DNA
ligase. The ligation mixture was then used to transform
DH5α by electroporation according to Bio-Rad (Hercules,
CA). After electroporation, 1 ml of SOC (2%
bactotryptone/0.5% yeast extract/10 mM NaCl/2.5 mM KCl/10
mM MgCl2/10 mM MgSO4/20 mM glucose) was added and
incubation continued for 1 h at 37°C. An aliquot was
plated on 2xYT (16 g/liter tryptone, 10 g/liter yeast
extract, 5 g/liter NaCl, pH 7.3) containing 30 µg/ml
chloramphenicol to determine the total number of
transformants, and the remainder was inoculated into 500
ml of 2xYT containing 30 µg/ml chloramphenicol and
cultured at 37°C overnight. Plasmids (random library
vector) were purified and used for transformation of
the recA718 polA12 strain.



For genetic complementation to determine active
polymerase mutants, E. coli recA718 polA12 cells (SC18-12
E. coli B/r strain, which has the genotype recA718 polA12
uvrA155 trpE65 lon-11 sulA1) were transformed with
plasmids pHSG576 or pTacTaq by electroporation (Bio-Rad
Genepulser, 2 kV, 25 µFD, 400 Ω) (Sweasy and Loeb, supra,
(1992); Sweasy and Loeb, Proc. Natl. Acad. Sci. USA
90:4626-4630 (1993); Witkin and Roegner-Maniscalo, J.
Bacteriol. 174:4166-4168 (1992)). Thereafter, 1 ml of
nutrient broth (NB) (8 g/liter) containing NaCl
(4 g/liter) and 1 mM isopropyl β-D-thiogalactoside (IPTG)
was added and the mixture was incubated for 1 h at 37°C.
The transformed cells were plated on nutrient agar plates
(containing 23 g/liter Difco nutrient agar, 5 g/liter
NaCl, 30 µg/ml chloramphenicol, 12.5 µg/ml tetracycline
phase at 30°C. Thereafter, ~10 µl (10^4 cells) was
introduced at the center of an agar plate, and the
inoculation loop was gradually moved from the center to
the periphery as the plate was rotated. Duplicate plates
were incubated at 30°C or 37°C for 30 h. To determine
complementation efficiency by Taq DNA polymerase I and to
isolate mutants, cultures of the recA718 polA12 strain
harboring either pHSG576 or Taq DNA polymerase I were
diluted with NB medium and plated (~500 colonies per
plate). Duplicate plates were incubated at 30°C or 37°C,
and visible colonies were counted after a 30 h
incubation. Complementation was verified by a second
round of electroporation and colony formation at the
nonpermissive temperature. Cell-free extracts were
prepared from selected colonies obtained at the
restrictive temperature and assayed to confirm that they
contained a temperature-resistant DNA polymerase activity
(Lawyer et al., J. Biol. Chem. 264:6427-6437 (1989)).



Wild type Taq DNA polymerase I was tested for
its ability to complement a temperature sensitive
polymerase contained in the E. coli strain recA718
polA12, which is unable to grow at 37°C in rich media at
low cell density (Witkin and Roegner-Maniscalo, 1992,
supra). The temperature sensitive phenotype of E. coli
strain recA718 polA12 was complemented by transformation
with the pTacTaq plasmid encoding wild type Taq DNA
polymerase I as indicated by growth at 37°C. Therefore,
this E. coli strain containing a temperature sensitive
polymerase provides a good model system for testing Taq
DNA polymerase I mutants.



To evaluate the involvement of different amino 



WO 98/23733 PCT/US97/21940 

38 

random sequences were substituted for nucleotides
encoding a portion of the substrate binding site of Taq
DNA polymerase I (O-helix, amino acids Arg659 through
Tyr671). The substituted stretch was 39 nucleotides long
with 9% randomization. At each position the proportion
of the wild type residue was 91% and the other 3
nucleotides were present in equal amounts (3% each).
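
The randomization scheme described above implies an average mutational load per oligonucleotide that can be estimated directly. The following Python sketch is illustrative and assumes that each of the 39 positions is doped independently; the function names are hypothetical.

```python
LENGTH = 39        # randomized nucleotides (13 codons, Arg659 through Tyr671)
WILD_TYPE = 0.91   # fraction of wild type nucleotide at each position


def expected_substitutions(length=LENGTH, wild_type=WILD_TYPE):
    """Mean number of non-wild-type nucleotides per oligonucleotide."""
    return length * (1 - wild_type)


def fraction_unmutated(length=LENGTH, wild_type=WILD_TYPE):
    """Fraction of oligonucleotides with the wild type nucleotide at every
    position, assuming each position is doped independently."""
    return wild_type ** length


print(round(expected_substitutions(), 2))
print(round(fraction_unmutated(), 4))
```

Under this assumption each oligonucleotide carries about 3.5 nucleotide substitutions on average, and only a few percent of the oligonucleotides retain the wild type nucleotide sequence throughout.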

A library of 50,000 independent mutants was
obtained. The number of colonies obtained at 37°C was
11.8% of that obtained at 30°C. Therefore, screening a
randomized library using E. coli strain recA718 polA12
provided approximately 5900 colonies containing active
Taq DNA polymerase and potential polymerase mutants.

These results show that a randomized library 
can be used to generate a population of polymerase 
mutants. These results also show the identification of 
active Taq DNA polymerase I mutants by screening for 
active polymerase mutants using genetic selection. 

EXAMPLE II 

Identification of Taq DNA Polymerase I Mutants and
Immutable or Nearly Immutable Amino Acid Residues

This example describes the identification of Taq
DNA polymerase I mutants generated by a randomized
library and the identification of immutable or nearly
immutable amino acid residues.

The active Taq DNA polymerase I mutants 
identified by the screen described in Example I were 
further characterized. The entire random nucleotide- 




plasmids obtained at 37°C (positively selected), 16
plasmids obtained at 30°C (nonselected) and 29 plasmids
obtained at 30°C, which failed to grow at 37°C (negatively
selected). All substitutions were in the randomized
nucleotides except for 12 clones.

Among the 230 positive plasmids, 168 contained
silent mutations in one or more codons. At the amino
acid level, 106 encoded the wild type residue and 124
encoded substitutions, in accord with the expected
distribution in the plasmid population. Of the 124
plasmids with amino acid changes, 40 were unique mutants
obtained just once. The remaining 84 plasmids
represented 21 different mutants. At least 79% of those
encoding the same amino acid substitutions were
independently derived since they contained different
silent mutations in other codons. In total, 61 different
amino acid sequences were obtained that complemented the
temperature-sensitive phenotype of the recA718 polA12
host.
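
The plasmid counts in this paragraph can be cross-checked with simple arithmetic. The sketch below only restates the numbers given in the text; the variable names are hypothetical.

```python
# Counts reported for the positively selected (37°C) plasmids.
positive_plasmids = 230
encoding_wild_type_protein = 106
encoding_substitutions = 124
mutants_found_once = 40        # unique mutants, one plasmid each
recurrent_plasmids = 84        # plasmids carrying a mutant seen more than once
recurrent_mutant_types = 21    # distinct mutants among those 84 plasmids

# Wild type plus substituted plasmids should account for every positive plasmid.
accounted_plasmids = encoding_wild_type_protein + encoding_substitutions

# Substituted plasmids split into single isolates and recurrent isolates.
substituted_plasmids = mutants_found_once + recurrent_plasmids

# Distinct mutant amino acid sequences recovered by the selection.
distinct_mutant_sequences = mutants_found_once + recurrent_mutant_types

# Average number of plasmids per recurrent mutant.
mean_isolates_per_recurrent_mutant = recurrent_plasmids / recurrent_mutant_types

if __name__ == "__main__":
    print(accounted_plasmids, substituted_plasmids, distinct_mutant_sequences)
```

The totals agree with the text: 106 + 124 = 230 plasmids, 40 + 84 = 124 substituted plasmids, and 40 + 21 = 61 different amino acid sequences.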

A compilation of the amino acid substitutions
found in Taq DNA polymerase I is shown in Figure 2.
Solid boxes indicate the amino acid residues for which no
substitutions were detected. Dashed boxes mark the amino
acid positions where only conservative substitutions were
found. The amino acid positions of Taq DNA polymerase I
and corresponding positions of E. coli DNA polymerase I
are indicated at the top. WT represents the wild type
sequence and randomized amino acids are written in
boldface type. The amino acids that have not been found
in the DNA polymerase I family are outlined (Braithwaite
and Ito, Nucleic Acids Res. 21:787-802 (1993)). Panel A
[...] the sequence of each multiply substituted mutant selected
from the 9% library. Panel C shows mutations selected
from the totally random library.

The distribution of single amino acid
substitutions among the active mutants was not random
(see Figure 2A). For example, numerous diverse
substitutions were observed at Ala661 and Thr664. In
contrast, no substitutions were detected at five
positions (Arg659, Arg660, Lys663, Phe667 and Gly668).
This uneven distribution of replacements is unlikely to
be the result of a bias in the nucleotide composition of
the random insert since sequencing of both the
nonselected and negatively selected plasmids revealed
multiple nucleotide substitutions at each of the targeted
positions and because silent mutations were detected at
each of these positions in the selected clones.

A nonrandom distribution of substitutions was
also observed among active mutants containing multiple
substitutions (see Figure 2B). Again, Ala661 and Thr664
were replaced with a variety of residues. However, no
amino acid substitutions were observed in place of
Arg659, Lys663 and Gly668, even though different silent
nucleotide substitutions were found at each of these
positions. A comparison of Figure 2A and B shows that
substitutions at Arg660 and Phe667 occur only in the
presence of substitutions at other positions. In
addition to the mutants containing multiple substitutions
shown in Figure 2B, two additional triple mutants were
also found: mutant 44, with Ala661Pro, Thr664Arg, and
Val669Leu; and mutant 54, with Ala661Thr, Thr664Pro and
Ile665Val.

The partially substituted library (9%) does not
provide a rigorous test of the immutability of specific
codons. Only 0.07% of sequences at each codon would be
expected to contain nucleotide substitutions at all three
positions. To further probe the mutability of specific
amino acid residues, a second library was constructed
that contained totally random substitutions at a limited
number of designated codons. In this library,
nucleotides encoding each of the five amino acids Arg659,
Arg660, Lys663, Phe667 and Gly668 were randomized. These
were amino acid positions that did not yield single
substitutions in the 9% random library (Figure 2A).
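
The 0.07% figure follows directly from the 9% per-nucleotide randomization. A minimal sketch of the arithmetic, assuming the three positions of a codon are randomized independently (an assumption of this illustration; the names are hypothetical):

```python
P_NUCLEOTIDE_SUBSTITUTION = 0.09   # per-position substitution rate in the 9% library
CODON_LENGTH = 3

# Probability that all three nucleotides of a given codon are substituted.
p_all_three_substituted = P_NUCLEOTIDE_SUBSTITUTION ** CODON_LENGTH

# Probability that a given codon carries at least one nucleotide substitution.
p_at_least_one_substituted = 1 - (1 - P_NUCLEOTIDE_SUBSTITUTION) ** CODON_LENGTH

if __name__ == "__main__":
    print(f"all three positions substituted: {p_all_three_substituted:.4%}")
    print(f"at least one position substituted: {p_at_least_one_substituted:.2%}")
```

Because amino acid changes that require two or three nucleotide substitutions within one codon are rarely sampled in the 9% library, codons of interest are better probed by full randomization, as in the second library.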

[...]

than the number required for each possible substitution 
at each of the target codons, were screened. At the 
nonpermissive temperature, 113 colonies were obtained, 84 
of which contained codons that encoded the wild type 
amino acid sequence. Most of the amino acid 
substitutions occurred in place of Arg660 or Gly668. 

Again, Arg659 and Lys663 were completely
conserved, with 16 and 5 silent mutations scored at these
codons, respectively. The expected numbers of silent
mutations were 21 and 4.2, respectively, assuming that
the 5 randomized oligomers that comprised the library
were mixed in equimolar proportions. These numbers show
that the oligomers were roughly equally represented in
the library and that sufficient mutants were sampled to
conclude that Arg659 and Lys663 are immutable in these
genetic complementation experiments (P < 0.05 for Met and
Trp, P < 0.01 for all other substitutions). Only Tyr
substituted for Phe at position 667 (Figure 2C), and six
silent mutations were scored for this codon. An
[...] library but not shown in Figure 2 is mutant 601, with
double substitutions Ile665Asn and Val669Ile.



These results show that generating a random
library and screening by genetic complementation provided
a number of active Taq DNA polymerase I mutants. These
results also show that amino acid residues Arg659 and
Lys663 were found to be immutable and Phe667 and Tyr671
were found to tolerate only conservative substitutions.



EXAMPLE III 

Determination of the Fidelity of Active Taq DNA
Polymerase I Mutants

This example describes methods of determining
the fidelity of active Taq DNA polymerase I mutants. Two
types of assays are useful for determining the fidelity
of active polymerase mutants, a primer extension assay
and a forward mutation assay.



Crude extracts were used to determine the
fidelity of polymerase mutants. A single colony of
E. coli DH5α (F-, φ80dlacZΔM15, Δ(lacZYA-argF)U169, deoR,
recA1, endA1, phoA, hsdR17(rk-, mk+), supE44, λ-, thi-1,
gyrA96, relA1) carrying wild type or mutant Taq DNA
polymerase I was inoculated into 40 ml of 2xYT
(16 g/liter tryptone, 10 g/liter yeast extract, 5 g/liter
NaCl, pH 7.3) containing 30 mg/liter chloramphenicol.
After incubation at 37°C overnight with vigorous shaking,
an equal amount of fresh medium with 0.5 mM IPTG was
added, and incubation was continued for 4 h. Cells were
harvested, washed once with TE buffer (10 mM Tris-HCl,
pH 8.0, 1 mM EDTA) and suspended in 100 µl of buffer A
([...] fluoride, 1 mM dithiothreitol, 0.5 mg/liter leupeptin,
1 mM EDTA, 250 mM KCl). Bacteria were lysed by
incubating with lysozyme (0.2 mg/ml) at 0°C for 2 h. The
lysate was centrifuged at 15,000 rpm (Sorvall, SA-600
rotor) (DuPont, Newtown, CT) for 15 min, and the
supernatant solution was incubated at 72°C for 20 min.
Insoluble material was removed by centrifugation.



Polymerases were purified as described
previously with some modifications (Lawyer et al., PCR
Methods Application 2:275-287 (1993)). Briefly, a single
colony of E. coli DH5α carrying wild type or mutant Taq
[...]
ml of the inoculum was immediately added to each of 5
bottles containing 1 liter of 2xYT with 30 mg/liter
chloramphenicol. After overnight incubation at 37°C with
vigorous shaking, 1 liter of 2xYT containing 30 mg/liter
chloramphenicol and 0.5 mM IPTG was added, and incubation
was continued for 4 h. Cells were harvested, washed once
with TE buffer and suspended in 100 ml buffer A.
Bacteria were lysed by incubating with lysozyme
(0.2 mg/ml) at 0°C for 2 h and then sonicating on ice for
45 sec by using a micro-tip probe (Sonifier, Branson
Sonic Power, Danbury, CT).



The lysate was centrifuged at 15,000 rpm
(Sorvall, SA-600 rotor) for 15 min, and the supernatant
solution was incubated at 72°C for 20 min. Insoluble
material was removed by centrifugation. Ammonium sulfate
(0.2 M) and Polymin P (0.6%) were added and the
suspension was held on ice for 1 h. After removal of the
precipitate by centrifugation and filtration through a
Costar 8310 filter, the filtrate was applied to a
[...] sulfate and 0.01% Triton X-100. The column was washed
with the same buffer (300 ml) and activity was eluted
with buffer B (TE buffer containing 0.01% Triton X-100
and 50 mM KCl). The eluate (100 ml) was dialyzed
overnight against 4 liters of buffer B and loaded onto a
0.8 x 8-cm heparin-SEPHAROSE CL6B (Pharmacia Biotech)
column equilibrated with buffer B. After washing with
buffer B (50 ml), activity was eluted in a 30 ml linear
gradient of 50-500 mM KCl in TE buffer containing 0.01%
Triton X-100. Active fractions were collected, dialyzed
against 50 mM Tris-HCl (pH 8.0) containing 50 mM KCl and
50% glycerol, and stored at -80°C.



To confirm and quantitate the presence of
polymerase activity, crude extracts or purified enzyme
was incubated at 72°C for 5 min in 50 mM Tris-HCl
(pH 8.0), 2 mM MgCl2, 100 µM each dATP, dGTP, dCTP and
dTTP, 0.2 µCi of [3H]dATP and 200 µg/ml activated calf
thymus DNA. Incorporation of radioactivity into an
acid-insoluble product was measured according to Battula
and Loeb (J. Biol. Chem. 249:4086-4093 (1974)). One unit
represents incorporation of 10 nmol of dNMP in 1 h,
corresponding to 0.1 unit as defined by Perkin-Elmer.
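
The unit definition can be written as a small conversion helper. This is a sketch only: it assumes that incorporation is linear with time over the assay period, which the text does not state, and the function names are hypothetical.

```python
NMOL_DNMP_PER_UNIT_HOUR = 10.0     # one unit incorporates 10 nmol dNMP in 1 h
PERKIN_ELMER_UNITS_PER_UNIT = 0.1  # conversion stated in the text

def units_from_incorporation(nmol_dnmp, minutes):
    """Activity in units, scaling the measured incorporation to 1 h.
    Assumes incorporation is linear with time (illustrative assumption)."""
    return (nmol_dnmp / NMOL_DNMP_PER_UNIT_HOUR) * (60.0 / minutes)

def to_perkin_elmer_units(units):
    """Convert units as defined here to units as defined by Perkin-Elmer."""
    return units * PERKIN_ELMER_UNITS_PER_UNIT

if __name__ == "__main__":
    # Hypothetical reading: 0.5 nmol dNMP incorporated in the 5 min assay.
    activity = units_from_incorporation(0.5, 5)
    print(activity, to_perkin_elmer_units(activity))
```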



For the primer extension assay, the 14-mer
primer 5'-CGCGCCGAATTCCC (SEQ ID NO:10) was 32P-labeled at
the 5' end by incubation with [γ-32P]ATP and T4
polynucleotide kinase and annealed to an equimolar amount
of the template 46-mer
5'-GCGCGGAAGCTTGGCTGCAGAATATTGCTAGCGGGAATTCGGCGCG
(SEQ ID NO:11). Heat-inactivated E. coli extracts
containing 0.3-1 unit of wild type or mutant Taq DNA
polymerases were incubated at 45°C for 60 min in 50 mM
[...] primer. A set of four additional reactions, each lacking
a different dNTP, was carried out for each polymerase.
Purified enzyme (1 unit) was incubated for the times
indicated under the same conditions as for crude
extracts. After electrophoresis in a 14% polyacrylamide
gel containing 8M urea, reaction products were analyzed
by autoradiography. Extension was quantified by using an
NIH imaging program (see http://www.nih.gov/).
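
The geometry of the primer-template pair determines where synthesis is expected to stall when one dNTP is omitted. The sketch below uses the two sequences given above (SEQ ID NO:10 and SEQ ID NO:11); the helper functions are hypothetical and assume simple Watson-Crick pairing with no misincorporation.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

PRIMER = "CGCGCCGAATTCCC"                                     # SEQ ID NO:10, 5'->3'
TEMPLATE = "GCGCGGAAGCTTGGCTGCAGAATATTGCTAGCGGGAATTCGGCGCG"   # SEQ ID NO:11, 5'->3'

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def annealing_start(primer=PRIMER, template=TEMPLATE):
    """0-based template index at which the primer binding site begins."""
    return template.find(reverse_complement(primer))

def template_read_order(primer=PRIMER, template=TEMPLATE):
    """Single-stranded template bases in the order they are copied
    (starting next to the primer 3' end and moving toward the template 5' end)."""
    return template[:annealing_start(primer, template)][::-1]

def correct_additions_before_block(missing_dntp):
    """Number of correct nucleotides that can be added before the first
    template base that pairs with the missing dNTP ('A', 'C', 'G' or 'T')."""
    blocking_template_base = COMPLEMENT[missing_dntp]
    return template_read_order().index(blocking_template_base)

if __name__ == "__main__":
    print("primer anneals at template index", annealing_start())
    for dntp in "GCTA":
        print(f"minus d{dntp}TP: {correct_additions_before_block(dntp)} "
              "correct additions before the first block")
```

With these sequences the primer pairs with the 14 nucleotides at the 3' end of the template, leaving 32 template nucleotides to be copied, and the first blocking position is reached after 0, 1, 2 or 3 correct additions in the reactions lacking dGTP, dCTP, dTTP or dATP, respectively.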

For the forward mutation assay, the non-coding
strand of the lacZα gene contained in 200 ng of gapped
M13mp2 DNA was copied by using 5 units of wild type or
mutant Taq DNA polymerase I [...]
containing 50 mM Tris-HCl (pH 8.0), 2 mM MgCl2 and 50 mM
KCl (Feig et al., Proc. Natl. Acad. Sci. USA 91:6609-6613
(1994)). For determining low fidelity polymerase
mutants, the reaction included 20 µM each dNTP. For
determining high fidelity polymerase mutants, the
reaction was carried out with biased dNTP pools
containing 0.5 mM of one dNTP and 20 mM of each of the
other three dNTPs. For example, the reaction could
contain 0.5 mM dATP and 20 mM each of dGTP, dCTP and
dTTP. After incubation at 72°C for 5 min, the DNA was
transfected into host E. coli and the plaques were scored
for white and pale blue mutant plaques (Tindall et al.,
Genetics 118:551-560 (1988)).



These results show that the fidelity of active
Taq DNA polymerase mutants can be determined using a
primer extension assay and a forward mutation assay.

EXAMPLE IV

Identification of Low Fidelity Taq DNA Polymerase I
Mutants

This example shows the identification of low
fidelity Taq DNA polymerase I mutants.

The active Taq DNA polymerase I mutants
identified in Example II were assayed by the methods
described in Example III to identify low fidelity
mutants. Screening for activity was carried out on 67 of
75 sequenced mutants, including all 38 with single amino
acid substitutions described in Figure 2. Plasmids
encoding the mutant polymerases were cloned, purified and
grown in E. coli, and host cells were analyzed for
expression of Taq DNA polymerase I by measuring the
activity of crude extracts. E. coli DNA polymerases and
nucleases were inactivated by heating at 72°C for 20 min.
The ability of heat-treated extracts to elongate primers
in the absence of a complete complement of four dNTPs was
then determined using a set of five reactions. One
reaction contained all four complementary nucleoside
triphosphates while each of the others lacked a different
dNTP ("minus conditions"). Elongation in the minus
reactions is limited by the rate of misincorporation at
template positions complementary to the missing dNTP.

A primer extension assay was performed on wild
type Taq DNA polymerase I and several mutants, revealing
that several mutants had elongation patterns that
differed from wild type Taq DNA polymerase. In the
presence of all four dNTPs, every extract examined
extended more than 90% of the hybridized primer to a
[...] the minus reactions, wild type Taq DNA polymerase I
extended 48-60% of the primer up to, but not opposite,
the first template position complementary to the missing
dNTP. The remaining primer was terminated opposite the
missing dNTP, presumably by incorporation of a single
non-complementary nucleotide, or was terminated further
downstream, presumably by extension of the mispaired
primer terminus. A variety of elongation patterns was
observed for the 67 mutants. Thirteen mutants extended
more of the primer and/or synthesized a greater
proportion of longer products than the wild type enzyme
in three or four of the minus reactions. For example,
mutant 2 formed full length products in reactions lacking
dGTP or dTTP. This increased extension presumably
reflects increased incorporation and/or extension of
non-complementary nucleotides. Other mutants extended
less of the primer or synthesized shorter products than
the wild type enzyme, for example, mutant 5. In several
cases, different amino acid substitutions at the same
position either increased or decreased extension in
comparable minus reactions.

A compilation of amino acid replacements in the 
13 mutants that displayed increased extension in at least 
three of the minus reactions is shown in Table I. With 
the exception of Gly668, one or more substitutions that 
putatively reduce the accuracy of DNA synthesis were 
observed for each of the 9 non-conserved amino acids. 
Eleven mutants harbored substitutions at either Ala661 or 
Thr664, including several single mutants. This initial 
screen with crude extracts suggested that a large number 
of changes are permitted in the O-helix that do not 
reduce the ability of Taq DNA polymerase I to complement 




Table I. Low Fidelity Mutants of Taq DNA Polymerase I
Identified in the Primer Extension Screen



659 663 667 671 

WT: RRAAKTINFGVLY 




29 
36 
40 
45 
53 
130 
156 
175 
206 
240 
247 
248: 
306 



S G 



P 
N 

S 
K 
R 
N 



I 

V 
I 



substitutions in the O-helix that do not reduce the
ability of Taq DNA polymerase I to carry out functional
complementation reduce the fidelity of DNA synthesis in
vitro.



To demonstrate that the reduction in fidelity
exhibited by crude extracts is due to mutant Taq DNA
polymerase I, wild type enzyme was purified as well as
the three single mutants Ala661Glu, Ala661Pro and
Thr664Arg. The mutant Ile665Thr, a mutant predicted to
have no alteration in fidelity based on complementation
assays, was also purified as a control. The mutated
enzymes retained at least 29% of wild type activity in
[...] temperature-sensitive host DNA polymerase I and ensures
that analysis of fidelity will not be complicated by
major impairments of catalytic efficiency.

Primer extension assays were carried out with
the homogeneous mutant polymerases. Wild type Taq DNA
polymerase I extended most of the primer to one
nucleotide before the template position opposite the
missing complementary dNTP in a 5 min reaction. Only
about 30% of the primers were elongated further. In
reactions containing equivalent activity, the mutant
polymerases Ala661Glu, Thr664Arg and Ala661Pro extended a
[...] wild type polymerase ceased synthesis. The control
enzyme Ile665Thr yielded an elongation pattern similar to
that of the wild type enzyme. Elongation reactions with
the three polymerases were also carried out for 60 min.
Again, Ala661Glu and Thr664Arg synthesized a greater
proportion of longer products than obtained with the wild
type and Ile665Thr polymerases. Notably, Ala661Glu,
Thr664Arg and Ala661Pro synthesized longer products in
5 min than the wild type did in 60 min.

To further analyze the reduced fidelity
exhibited by the low fidelity polymerase mutants, a time
course of primer elongation was carried out. Wild type
Taq DNA polymerase I extended 9% of the primers past the
first deoxyguanosine template residue within the 60 min
incubation period, but elongation past the second
deoxyguanosine was not detected. In the same interval,
Thr664Arg extended 93% of the primer past the first
template deoxyguanosine, and elongation proceeded past as
many as five template deoxyguanosines. Importantly, a
[...] the products. These time course data indicate that
greater elongation reflects increased ability to utilize
non-complementary substrates and primer termini, rather
than a putative difference in the amount of activity
present.

In a forward mutation assay, the fidelity of
DNA synthesis by the purified polymerases was quantitated
by measuring the frequency of mutations produced by
copying a biologically active template in vitro (Kunkel
and Loeb, J. Biol. Chem. 254:5718-5725 (1979)). The
target sequence was the lacZα gene located within a
single-stranded region in gapped circular double-stranded
M13mp2 DNA (Feig and Loeb, Biochemistry 32:4466-4473
(1993)). The gapped segment was filled by synthesis with
the wild type or mutant enzymes. The double-stranded
circular product was transfected into E. coli, and the
mutation frequency was determined by scoring white and
pale blue mutant plaques. A comparison of the specific
activities and mutation frequencies of the purified
enzymes is presented in Table II. After synthesis by
wild type Taq DNA polymerase I, the mutation frequency
was not greater than that of the uncopied control.
Synthesis by Ala661Glu and Thr664Arg gave rise to
mutation frequencies more than 7- and 25-fold greater,
respectively, than that of the wild type polymerase.
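
The mutation frequencies listed in Table II are the ratio of mutant plaques to total plaques scored. A minimal sketch that recomputes them from the plaque counts reported in Table II (the dictionary and function names are hypothetical):

```python
# (total plaques scored, mutant plaques) as reported in Table II.
TABLE_II_PLAQUES = {
    "WT": (8637, 22),
    "A661E": (6782, 116),
    "T664R": (5148, 324),
}

def mutation_frequency(total_plaques, mutant_plaques):
    """Fraction of scored plaques that show the white or pale blue mutant phenotype."""
    return mutant_plaques / total_plaques

# Frequencies expressed in units of 10^-3, rounded to one decimal place as in the table.
frequency_e3 = {
    enzyme: round(1000 * mutation_frequency(total, mutant), 1)
    for enzyme, (total, mutant) in TABLE_II_PLAQUES.items()
}

if __name__ == "__main__":
    for enzyme, value in frequency_e3.items():
        print(f"{enzyme}: {value} x 10^-3")
```

The recomputed values reproduce the tabulated frequencies of 2.5, 17.1 and 62.9 (x 10^-3).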

Table II. Mutation Frequency in the lacZα Forward
Mutation Assay

Taq Pol I   Specific Activity   Plaques Scored      Mutation Frequency
            (units/mg)          Total     Mutant    (x 10⁻³)

WT          66,000              8,637     22        2.5
A661E       45,000              6,782     116       17.1
T664R       23,000              5,148     324       62.9

A sample of independent, randomly chosen
mutants produced by Thr664Arg was characterized by DNA
sequence analysis using a THERMO SEQUENASE cycle
sequencing kit (Amersham Life Science, Cleveland, OH).
Both base substitutions and frameshifts were found
throughout the targeted lacZα gene and its regulatory
sequence. Of the 64 independent plaques, 57 had
mutations in the target. Other mutations presumably
occurred outside the target region. Some had more than
one base substitution and a total of 66 mutations were
observed (see Figure 3). Among them, 61 were base
substitutions. Transitions (38/61) were more frequent
than transversions (23/61). T→C transitions accounted
for 31 of 61 base substitutions, while T→A (9/61), A→T
(8/61) and G→A (5/61) substitutions were less
frequent. This base substitution spectrum is essentially
the same as that reported for wild type Taq DNA
polymerase I (Tindall and Kunkel, supra, 1988). From
these data, the base substitution fidelity of Thr664Arg
can be calculated as 8.6 x 10⁻⁴ or 1 error per 1200
nucleotides. On the basis of the five frameshift mutants
detected, the frameshift error can be calculated as 4.9 x
10⁻⁵ or 1 error per 20,000 nucleotides.
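
The "1 error per N nucleotides" figures are the reciprocals of the quoted error rates, and the mutation counts can be tallied the same way. The sketch below only re-derives numbers stated in the text; it does not reproduce how the error rates themselves were obtained, and the names are hypothetical.

```python
BASE_SUBSTITUTION_ERROR_RATE = 8.6e-4   # errors per nucleotide, Thr664Arg
FRAMESHIFT_ERROR_RATE = 4.9e-5          # errors per nucleotide, Thr664Arg

def nucleotides_per_error(error_rate):
    """Average number of nucleotides polymerized per error."""
    return 1.0 / error_rate

# Mutation tallies from the sequenced Thr664Arg plaques.
transitions = 38
transversions = 23
frameshifts = 5
base_substitutions = transitions + transversions
total_mutations = base_substitutions + frameshifts

if __name__ == "__main__":
    print(round(nucleotides_per_error(BASE_SUBSTITUTION_ERROR_RATE)))  # about 1200
    print(round(nucleotides_per_error(FRAMESHIFT_ERROR_RATE)))         # about 20,000
    print(base_substitutions, total_mutations)
```

The reciprocals are about 1,160 and 20,400 nucleotides per error, which round to the 1200 and 20,000 quoted in the text, and the tallies give 61 base substitutions and 66 mutations in total.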



These results show that low fidelity Taq DNA
polymerase I mutants were identified from a randomized
library using a genetic complementation screen. The
fidelity of Taq DNA polymerase I mutants was determined
[...]

EXAMPLE V

Identification of High Fidelity Taq DNA Polymerase I
Mutants

This example shows the identification of high
fidelity Taq DNA polymerase I mutants.

The active Taq DNA polymerase I mutants
identified in Example II were assayed by the methods
described in Example III to identify high fidelity
mutants. A panel of 75 active polymerases was screened.
Candidate high fidelity polymerase mutants are shown in
Table III.

Table III. Candidate High Fidelity Mutants of
Taq DNA Polymerase I

659 663 667 671

WT: RRAAKTINFGVLY

FL : L
74 : E T L
146 : D
147 : I
149 : ID
169 : S L
186 : L
219 : P V Y
254 : V
407 : Y
424 : Y
426 : S
487 : R
488 : K
530 : S
614 : Q

Thirteen of the active polymerases exhibited greater
accuracy in DNA synthesis. Table IV summarizes the
results of a forward mutation assay of some of these high
fidelity mutants. Several polymerase mutants displayed
higher fidelity than the wild type Taq DNA polymerase.
Polymerase mutants exhibiting particularly high fidelity
are mutant 424, with Phe667Tyr, mutant 426, with
Arg660Ser and mutant 488, with Arg660Lys.

Table IV. Fidelity of Taq DNA Polymerase Mutants in a
lacZ Forward Mutation Assay

Enzyme       Total      Mutant     Mutation Frequency
             Plaques    Plaques    (x 10⁻³)

Wild Type    5680       49         8.6

High Fidelity Mutants
MS147        7249       47         6.5
MS169        7275       34         5.1
MS254        6898       40         5.8
MS424        4810       14         2.7
MS426        5727       23         4.1
MS488        3442       13         1.5

Low Fidelity Mutant
MS206        3333       133        40

These results show that Taq DNA polymerase
mutants were identified and found to exhibit higher
fidelity than wild type Taq DNA polymerase.



EXAMPLE VI 

High Fidelity Taq DNA Polymerase Mutants Enhance the
Sensitivity of Mismatch PCR-based Assays for Somatic
Mutations

This example shows the use of high fidelity
mutants obtained by mutating the active site O-helix of
Taq DNA polymerase I to enhance the sensitivity of
mismatch PCR-based assays for somatic mutations.



Mismatch PCR is the basis of allele-specific
identification of inherited mutations within genes and
somatic mutations that occur in tumors. In these
studies, one compares the extension of a correctly
matched primer with the lack of extension using a primer
with a 3'-terminal mismatch. The rate of extension by
DNA polymerase using a primer with a single mismatch
compared to a primer with a 3'-complementary base pair
(matched) terminus is approximately 10⁻⁵ (Perrino and
Loeb, J. Biol. Chem. 262:2898-2905 (1989)). Elongation
from a double mismatch is even less frequent, and thus
offers an even more stringent test of the inability of
mutant Taq DNA polymerases to elongate a mismatched
primer terminus.



A template containing the wild type sequence of
human DNA polymerase-β at nucleotide positions 886-889
(CCCCTGGG) was utilized. PCR reactions were carried out
with two complementary primers that flank the sequence
[...] mismatched template containing a terminally mismatched
primer with AA at the 3'-terminal position. The AA would
be across from the CC (underlined) in the template
strand. In these studies, the ratio of templates
containing the complementary and non-complementary
sequences was varied. The PCR amplified product was
separated by polyacrylamide gel electrophoresis and
quantitated by phosphoimage analysis. Wild type Taq DNA
polymerase detected one molecule of template containing a
TT substitution in place of the two template CC when
present in a population of 10⁵ molecules containing the
non-mutant templates with the CC sequence. In
contrast, both of the high fidelity Taq DNA polymerase
mutants, with substitutions Phe667Tyr and Arg660Ser,
detected one molecule of the TT template amongst 10⁸
molecules of the CC template when the primer contained
two terminal 3'-AA nucleotide residues.

These results show that high fidelity Taq DNA
polymerase mutants have two to three orders of magnitude
enhanced sensitivity for detecting mutant DNA using a
mismatch PCR-based assay.
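
The gain in sensitivity can be expressed as the ratio of the background populations in which a single mutant template was detected. A minimal sketch using the two detection limits quoted in this example (the variable names are hypothetical):

```python
from math import log10

# Number of non-mutant (CC) template molecules among which one mutant (TT)
# template molecule was detected, as quoted in this example.
WILD_TYPE_BACKGROUND = 10**5
HIGH_FIDELITY_BACKGROUND = 10**8

# Fold improvement in the detectable mutant fraction.
fold_improvement = HIGH_FIDELITY_BACKGROUND // WILD_TYPE_BACKGROUND

# The same improvement expressed in orders of magnitude.
orders_of_magnitude = log10(fold_improvement)

if __name__ == "__main__":
    print(fold_improvement, orders_of_magnitude)
```

For the two values quoted here the improvement is 1,000-fold, the upper end of the stated two to three orders of magnitude.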

EXAMPLE VII 

High Fidelity Taq DNA Polymerase Mutants Enhance
Sensitivity of Detection of Repetitive DNA Sequences

This example demonstrates the use of high
fidelity polymerase mutants to enhance the sensitivity
and accuracy of amplifying repetitive DNA sequences.

Detection of the length of unstable
microsatellite DNA in certain human tumors has depended
[...] gels. Due to the slippage of DNA polymerase while
copying repetitive DNA, the interpretation of the results
of this method has remained unsatisfactory.

High fidelity Taq DNA polymerases are
identified using the methods described in Examples I and
III. DNA templates containing runs of CA repeats with
the number of repeats varying from 5 to 50 are used to
test high fidelity Taq DNA polymerase mutants. After 20
to 70 rounds of PCR amplification, the product of the
reaction is displayed on polyacrylamide gels. High
fidelity polymerase mutants which display fewer slippage
errors copying the repetitive sequences are identified.
These high fidelity polymerase mutants are used to
amplify repetitive DNA sequences in samples, for example
tissue or tumor samples.

These results show that high fidelity mutants
having enhanced sensitivity and accuracy in amplifying
repetitive DNA sequences can be identified and used to
amplify repetitive DNA in tissue or tumor samples.

Throughout this application various
publications have been referenced. The disclosures of
these publications in their entireties are hereby
incorporated by reference in this application in order to
more fully describe the state of the art to which this
invention pertains.

Although the invention has been described with
reference to the disclosed embodiments, those skilled in
the art will readily appreciate that the specific
experiments detailed are only illustrative of the
[...] modifications can be made without departing from the
spirit of the invention.




We claim: 






1. A method for identifying a thermostable
polymerase having altered fidelity, comprising generating
a random population of polymerase mutants by mutating one
or more amino acid residues adjacent to an immutable or
nearly immutable residue in an active site O-helix of a
thermostable polymerase and screening said population for
one or more active polymerase mutants.

2. The method of claim 1, further comprising
determining a fidelity of said active polymerase mutant.

3. The method of claim 1, wherein said one or
more amino acid residues is immediately adjacent to an
immutable or nearly immutable residue.

4. The method of claim 1, wherein said one or
more amino acid residues is adjacent to an amino acid
residue corresponding to Arg659, Lys663, Phe667 or Tyr671
in Taq DNA polymerase.

5. The method of claim 4, wherein said
thermostable polymerase is Taq DNA polymerase.

6. An isolated thermostable polymerase mutant
having altered fidelity, wherein said polymerase mutant
comprises one or more mutated amino acid residues
adjacent to an immutable or nearly immutable residue in
the active site O-helix of a thermostable polymerase.

7. The polymerase mutant of claim 6, wherein
said polymerase is Taq DNA polymerase.

8. The polymerase mutant of claim 6, wherein
said one or more amino acid residues is immediately
adjacent to an immutable or nearly immutable residue.

9. The polymerase mutant of claim 6, wherein 
said mutated amino acid residue is adjacent to an amino 
acid residue corresponding to Arg659, Lys663, Phe667 or 
Tyr671 in Taq DNA polymerase. 

10. The polymerase mutant of claim 9, wherein
said polymerase is Taq DNA polymerase.

11. The polymerase mutant of claim 7, wherein 
said mutant is a high fidelity mutant. 

12. The polymerase mutant of claim 11, wherein 
said polymerase mutant comprises one or more amino acid 
substitutions selected from the group consisting of 
Arg660Tyr; Arg660Ser; Gly668Arg; Arg660Lys; Gly668Ser; 
and Gly668Gln. 

13. The polymerase mutant of claim 7, wherein 
said mutant is a low fidelity mutant. 

14. The polymerase mutant of claim 13, wherein 
said polymerase mutant comprises substitution of one or 
more amino acids selected from the group consisting of 
Ala661, Thr664, Asn666 and Leu670. 

15. An isolated nucleic acid molecule encoding
a polymerase mutant having high fidelity, comprising a
nucleotide sequence encoding substantially an amino acid
sequence of Taq DNA polymerase I comprising one or more
amino acid substitutions selected from the group
consisting of Arg660Tyr; Arg660Ser; Gly668Arg; Arg660Lys;
Gly668Ser; and Gly668Gln.

16. An isolated nucleic acid molecule encoding
a polymerase mutant having low fidelity, comprising a
nucleotide sequence encoding substantially an amino acid
sequence of Taq DNA polymerase I comprising substitution
of one or more amino acids selected from the group
consisting of Ala661, Thr664, Asn666 and Leu670.

17. A method for identifying one or more
mutations in a gene, comprising amplifying said gene
using a high fidelity polymerase mutant under conditions
which allow polymerase chain reaction amplification.

18. A method for identifying one or more
mutations in a gene, comprising amplifying said gene
using the high fidelity polymerase mutant of claim 11
under conditions which allow polymerase chain reaction
amplification.

19. The method of claim 17, wherein said gene
is amplified by exposing the strands of said gene to
repeated cycles of denaturing, annealing and elongation
to produce an amplified product.

20. The method of claim 19, further comprising
determining the presence or absence of one or more
mutations in the sequence of said gene.

21. The method of claim 17, wherein said
polymerase mutant comprises one or more amino acid
substitutions selected from the group consisting of



22. A method for accurately copying repetitive 
nucleotide sequences, comprising amplifying said 
repetitive nucleotide sequence using a high fidelity 
polymerase mutant. 

23. The method of claim 22, wherein said 
repetitive nucleotide sequence is in a gene. 

24. The method of claim 22, wherein said 
repetitive nucleotide sequence is in a microsatellite 
between genes. 

25. A method for accurately copying repetitive 
nucleotide sequences, comprising amplifying said 
repetitive nucleotide sequence using said high fidelity 
polymerase mutant of claim 11. 

26. A method for determining an inherited 
mutation, comprising amplifying a gene using a high 
fidelity polymerase mutant. 

27. A method for diagnosing a genetic disease, 
comprising correlating the inherited mutation determined 
in claim 26 with said genetic disease. 

28. A method for diagnosing a genetic disease, 
comprising amplifying a gene using a high fidelity 
polymerase mutant. 

29. A method for diagnosing a genetic disease, 
comprising amplifying a gene using said high fidelity 
polymerase mutant of claim 11. 

30. The method of claim 28, wherein said 
genetic disease comprises mutations in microsatellite or 
repetitive DNA. 

31. The method of claim 30, wherein said 
genetic disease is cancer. 

32. A method for determining the prognosis of 
a genetic disease, comprising amplifying said gene in 
claim 28. 

33. The method of claim 28, wherein said 
polymerase mutant comprises one or more amino acid 
substitutions selected from the group consisting of 
Arg660Tyr; Arg660Ser; Gly668Arg; Arg660Lys; Gly668Ser; 
and Gly668Gln. 

34. A method for randomly mutagenizing a gene, 
comprising amplifying said gene using a low fidelity 
polymerase mutant. 

35. A method for randomly mutagenizing a gene, 
comprising amplifying said gene using said low fidelity 
polymerase mutant of claim 13. 

36. The method of claim 35, wherein said 
polymerase mutant comprises substitution of one or more 
amino acid residues selected from the group consisting of 
Ala661, Thr664, Asn666 and Leu670. 

AAGCTCAGAT CTACCTGCCT GAGGGCGTCC GGTTCCAGCT GGCCCTTCCC GAGGGGGAGA 60 
GGGAGGCGTT TCTAAAAGCC CTTCAGGACG CTACCCGGGG GCGGGTGGTG GAAGGGTAAC 120 

ATG AGG GGG ATG CTG CCC CTC TTT GAG CCC AAG GGC CGG GTC CTC CTG 168 
Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu 
1 5 10 15 

GTG GAC GGC CAC CAC CTG GCC TAC CGC ACC TTC CAC GCC CTG AAG GGC 216 
Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly 
20 25 30 

CTC ACC ACC AGC CGG GGG GAG CCG GTG CAG GCG GTC TAC GGC TTC GCC 264 
Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala 
35 40 45 

AAG AGC CTC CTC AAG GCC CTC AAG GAG GAC GGG GAC GCG GTG ATC GTG 312 
Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val 
50 55 60 

GTC TTT GAC GCC AAG GCC CCC TCC TTC CGC CAC GAG GCC TAC GGG GGG 360 
Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly 
65 70 75 80 

TAC AAG GCG GGC CGG GCC CCC ACG CCG GAG GAC TTT CCC CGG CAA CTC 408 
Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu 
85 90 95 

GCC CTC ATC AAG GAG CTG GTG GAC CTC CTG GGG CTG GCG CGC CTC GAG 456 
Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu 
100 105 110 

GTC CCG GGC TAC GAG GCG GAC GAC GTC CTG GCC AGC CTG GCC AAG AAG 504 
Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys 
115 120 125 

GCG GAA AAG GAG GGC TAC GAG GTC CGC ATC CTC ACC GCC GAC AAA GAC 552 
Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp 
130 135 140 

CTT TAC CAG CTC CTT TCC GAC CGC ATC CAC GTC CTC CAC CCC GAG GGG 600 
Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly 
145 150 155 160 

TAC CTC ATC ACC CCG GCC TGG CTT TGG GAA AAG TAC GGC CTG AGG CCC 648 
Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro 
165 170 175 

GAC CAG TGG GCC GAC TAC CGG GCC CTG ACC GGG GAC GAG TCC GAC AAC 696 
Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn 
180 185 190 

CTT CCC GGG GTC AAG GGC ATC GGG GAG AAG ACG GCG AGG AAG CTT CTG 744 
Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu 
195 200 205 

FIG. 1A 

GAG GAG TGG GGG AGC CTG GAA GCC CTC CTC AAG AAC CTG GAC CGG CTG 792 
Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu 
210 215 220 

AAG CCC GCC ATC CGG GAG AAG ATC CTG GCC CAC ATG GAC GAT CTG AAG 840 
Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys 
225 230 235 240 

CTC TCC TGG GAC CTG GCC AAG GTG CGC ACC GAC CTG CCC CTG GAG GTG 888 
Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val 
245 250 255 

GAC TTC GCC AAA AGG CGG GAG CCC GAC CGG GAG AGG CTT AGG GCC TTT 936 
Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe 
260 265 270 

CTG GAG AGG CTT GAG TTT GGC AGC CTC CTC CAC GAG TTC GGC CTT CTG 984 
Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu 
275 280 285 

GAA AGC CCC AAG GCC CTG GAG GAG GCC CCC TGG CCC CCG CCG GAA GGG 1032 
Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly 
290 295 300 

GCC TTC GTG GGC TTT GTG CTT TCC CGC AAG GAG CCC ATG TGG GCC GAT 1080 
Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp 
305 310 315 320 

CTT CTG GCC CTG GCC GCC GCC AGG GGG GGC CGG GTC CAC CGG GCC CCC 1128 
Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro 
325 330 335 

GAG CCT TAT AAA GCC CTC AGG GAC CTG AAG GAG GCG CGG GGG CTT CTC 1176 
Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu 
340 345 350 

GCC AAA GAC CTG AGC GTT CTG GCC CTG AGG GAA GGC CTT GGC CTC CCG 1224 
Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro 
355 360 365 

CCC GGC GAC GAC CCC ATG CTC CTC GCC TAC CTC CTG GAC CCT TCC AAC 1272 
Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn 
370 375 380 

ACC ACC CCC GAG GGG GTG GCC CGG CGC TAC GGC GGG GAG TGG ACG GAG 1320 
Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu 
385 390 395 400 

GAG GCG GGG GAG CGG GCC GCC CTT TCC GAG AGG CTC TTC GCC AAC CTG 1368 
Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu 
405 410 415 

TGG GGG AGG CTT GAG GGG GAG GAG AGG CTC CTT TGG CTT TAC CGG GAG 1416 
Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu 
420 425 430 

FIG. 1B 

GTG GAG AGG CCC CTT TCC GCT GTC CTG GCC CAC ATG GAG GCC ACG GGG 1464 
Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly 
435 440 445 

GTG CGC CTG GAC GTG GCC TAT CTC AGG GCC TTG TCC CTG GAG GTG GCC 1512 
Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala 
450 455 460 

GAG GAG ATC GCC CGC CTC GAG GCC GAG GTC TTC CGC CTG GCC GGC CAC 1560 
Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His 
465 470 475 480 

CCC TTC AAC CTC AAC TCC CGG GAC CAG CTG GAA AGG GTC CTC TTT GAC 1608 
Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp 
485 490 495 

GAG CTA GGG CTT CCC GCC ATC GGC AAG ACG GAG AAG ACC GGC AAG CGC 1656 
Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg 
500 505 510 

TCC ACC AGC GCC GCC GTC CTG GAG GCC CTC CGC GAG GCC CAC CCC ATC 1704 
Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile 
515 520 525 

GTG GAG AAG ATC CTG CAG TAC CGG GAG CTC ACC AAG CTG AAG AGC ACC 1752 
Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr 
530 535 540 

TAC ATT GAC CCC TTG CCG GAC CTC ATC CAC CCC AGG ACG GGC CGC CTC 1800 
Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu 
545 550 555 560 

CAC ACC CGC TTC AAC CAG ACG GCC ACG GCC ACG GGC AGG CTA AGT AGC 1848 
His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser 
565 570 575 

TCC GAT CCC AAC CTC CAG AAC ATC CCC GTC CGC ACC CCG CTT GGG CAG 1896 
Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln 
580 585 590 

AGG ATC CGC CGG GCC TTC ATC GCC GAG GAG GGG TGG CTA TTG GTG GCC 1944 
Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala 
595 600 605 

CTG GAC TAT AGC CAG ATA GAG CTC AGG GTG CTG GCC CAC CTC TCC GGC 1992 
Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly 
610 615 620 

GAC GAG AAC CTG ATC CGG GTC TTC CAG GAG GGG CGG GAC ATC CAC ACG 2040 
Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr 
625 630 635 640 

GAG ACC GCC AGC TGG ATG TTC GGC GTC CCC CGG GAG GCC GTG GAC CCC 2088 
Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro 
645 650 655 

FIG. 1C 

CTG ATG CGC CGG GCG GCC AAG ACC ATC AAC TTC GGG GTC CTC TAC GGC 2136 
Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly 
660 665 670 

ATG TCG GCC CAC CGC CTC TCC CAG GAG CTA GCC ATC CCT TAC GAG GAG 2184 
Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu 
675 680 685 

GCC CAG GCC TTC ATT GAG CGC TAC TTT CAG AGC TTC CCC AAG GTG CGG 2232 
Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg 
690 695 700 

GCC TGG ATT GAG AAG ACC CTG GAG GAG GGC AGG AGG CGG GGG TAC GTG 2280 
Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val 
705 710 715 720 

GAG ACC CTC TTC GGC CGC CGC CGC TAC GTG CCA GAC CTA GAG GCC CGG 2328 
Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg 
725 730 735 

GTG AAG AGC GTG CGG GAG GCG GCC GAG CGC ATG GCC TTC AAC ATG CCC 2376 
Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro 
740 745 750 

GTC CAG GGC ACC GCC GCC GAC CTC ATG AAG CTG GCT ATG GTG AAG CTC 2424 
Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu 
755 760 765 

TTC CCC AGG CTG GAG GAA ATG GGG GCC AGG ATG CTC CTT CAG GTC CAC 2472 
Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His 
770 775 780 

GAC GAG CTG GTC CTC GAG GCC CCA AAA GAG AGG GCG GAG GCC GTG GCC 2520 
Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala 
785 790 795 800 

CGG CTG GCC AAG GAG GTC ATG GAG GGG GTG TAT CCC CTG GCC GTG CCC 2568 
Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro 
805 810 815 

CTG GAG GTG GAG GTG GGG ATA GGG GAG GAC TGG CTC TCC GCC AAG GAG 2616 
Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu 
820 825 830 

TGATACCACC 2626 

FIG. 1D 
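
The residue positions named in the claims (Arg660, Ala661, Thr664, Asn666, Gly668 and Leu670) can be cross-checked against the codon row of FIG. 1 that ends at nucleotide 2136. The following is a minimal illustrative sketch, not part of the original disclosure. It assumes the standard genetic code and assumes, from the residue numbering printed beneath that row, that the row begins at residue 657. The function name translate_row is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation table is needed for this sequence).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter amino acid names, as used in the figure.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_row(codon_row: str, first_residue: int) -> dict[int, str]:
    """Map residue numbers to three-letter names for one row of codons."""
    codons = codon_row.upper().split()
    return {
        first_residue + i: THREE_LETTER[CODON_TABLE[codon]]
        for i, codon in enumerate(codons)
    }


# Codon row copied from FIG. 1 (the row ending at nucleotide 2136).
ROW = "CTG ATG CGC CGG GCG GCC AAG ACC ATC AAC TTC GGG GTC CTC TAC GGC"
residues = translate_row(ROW, first_residue=657)  # 657 is an assumption

for position in (660, 661, 664, 666, 668, 670):
    print(position, residues[position])
```

Under these assumptions, each printed residue can be compared with the wild-type residue named in the claims. The same helper could be applied to any other row of the figure to check a codon line against its amino acid line.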